A Virtual Home - MozJPEG
https://blog.avirtualhome.com/
Memory problems with MozJPEG and Pillow
Peter van der Does, 2018-08-19T15:08:00-04:00
<p>After implementing MozJPEG to create smaller files, we noticed it would not always work; we got the message &ldquo;I/O suspension not supported in scan&nbsp;optimization&rdquo;.</p><p>We implemented <a href="https://github.com/mozilla/mozjpeg">MozJPEG</a> with <a href="https://python-pillow.org/">Pillow 4.x</a> to create smaller thumbnails of files uploaded by users, and then noticed that this process sometimes failed. Our logs showed the following error message: <code>I/O suspension not supported in scan optimization</code>. Time to enter the <span class="caps">GSO</span> workflow; <span class="caps">GSO</span> stands for Google Stack Overflow, in other words: search the Internet. The search results for the error message mostly link to the source code of MozJPEG, which is not very helpful at&nbsp;first.</p> <p>Time to brush up on my C knowledge. <span class="caps">OK</span>, I have never programmed in C, but that doesn&rsquo;t stop me from going through the&nbsp;source.</p> <p>The error message is defined in <code>jerror.h</code>:</p> <div class="highlight"><pre><span></span><span class="cp">#endif</span>
<span class="n">JMESSAGE</span><span class="p">(</span><span class="n">JERR_BAD_PARAM</span><span class="p">,</span> <span class="s">&quot;Bogus parameter&quot;</span><span class="p">)</span>
<span class="n">JMESSAGE</span><span class="p">(</span><span class="n">JERR_BAD_PARAM_VALUE</span><span class="p">,</span> <span class="s">&quot;Bogus parameter value&quot;</span><span class="p">)</span>
<span class="n">JMESSAGE</span><span class="p">(</span><span class="n">JERR_UNSUPPORTED_SUSPEND</span><span class="p">,</span> <span class="s">&quot;I/O suspension not supported in scan 
optimization&quot;</span><span class="p">)</span>
<span class="cp">#ifdef JMAKE_ENUM_LIST</span>
</pre></div> <p>So now we have to find where the <code>JERR_UNSUPPORTED_SUSPEND</code> constant is used. Luckily it is used in only one file, <code>jcmaster.c</code>:</p> <div class="highlight"><pre><span></span><span class="k">while</span> <span class="p">(</span><span class="n">size</span> <span class="o">&gt;=</span> <span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">free_in_buffer</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">MEMCOPY</span><span class="p">(</span><span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">next_output_byte</span><span class="p">,</span> <span class="n">src</span><span class="p">,</span> <span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">free_in_buffer</span><span class="p">);</span>
  <span class="n">src</span> <span class="o">+=</span> <span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">free_in_buffer</span><span class="p">;</span>
  <span class="n">size</span> <span class="o">-=</span> <span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">free_in_buffer</span><span class="p">;</span>
  <span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">next_output_byte</span> <span class="o">+=</span> <span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">free_in_buffer</span><span class="p">;</span>
  <span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span 
class="n">free_in_buffer</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="p">(</span><span class="o">*</span><span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">empty_output_buffer</span><span class="p">)(</span><span class="n">cinfo</span><span class="p">))</span>
    <span class="n">ERREXIT</span><span class="p">(</span><span class="n">cinfo</span><span class="p">,</span> <span class="n">JERR_UNSUPPORTED_SUSPEND</span><span class="p">);</span>
<span class="p">}</span>
</pre></div> <p>Cool, it seems to be related to the output buffer. My guess, based on the <code>empty_output_buffer</code> call, is that the error is raised when the buffer is full and cannot be emptied. Now we have to find out where Pillow sets the buffer size for saving a <span class="caps">JPEG</span>&nbsp;image.</p> <p>The file <code>PIL/JpegImagePlugin.py</code> handles everything related to <span class="caps">JPEG</span> images, including&nbsp;saving.</p> <p>The whole save method is a bit large to post here, but the part below determines the buffer size used to save the image. 
The buffer size is set so that it can hold the entire image&nbsp;file.</p> <div class="highlight"><pre><span></span><span class="n">bufsize</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">if</span> <span class="n">optimize</span> <span class="ow">or</span> <span class="n">progressive</span><span class="p">:</span>
    <span class="c1"># CMYK can be bigger</span>
    <span class="k">if</span> <span class="n">im</span><span class="o">.</span><span class="n">mode</span> <span class="o">==</span> <span class="s1">&#39;CMYK&#39;</span><span class="p">:</span>
        <span class="n">bufsize</span> <span class="o">=</span> <span class="mi">4</span> <span class="o">*</span> <span class="n">im</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="n">im</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
    <span class="c1"># keep sets quality to 0, but the actual value may be high.</span>
    <span class="k">elif</span> <span class="n">quality</span> <span class="o">&gt;=</span> <span class="mi">95</span> <span class="ow">or</span> <span class="n">quality</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
        <span class="n">bufsize</span> <span class="o">=</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">im</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="n">im</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">bufsize</span> <span class="o">=</span> <span class="n">im</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span 
class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="n">im</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>

<span class="c1"># The exif info needs to be written as one block, + APP1, + one spare byte.</span>
<span class="c1"># Ensure that our buffer is big enough. Same with the icc_profile block.</span>
<span class="n">bufsize</span> <span class="o">=</span> <span class="nb">max</span><span class="p">(</span><span class="n">ImageFile</span><span class="o">.</span><span class="n">MAXBLOCK</span><span class="p">,</span> <span class="n">bufsize</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">info</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">&quot;exif&quot;</span><span class="p">,</span> <span class="sa">b</span><span class="s2">&quot;&quot;</span><span class="p">))</span> <span class="o">+</span> <span class="mi">5</span><span class="p">,</span>
              <span class="nb">len</span><span class="p">(</span><span class="n">extra</span><span class="p">)</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>

<span class="n">ImageFile</span><span class="o">.</span><span class="n">_save</span><span class="p">(</span><span class="n">im</span><span class="p">,</span> <span class="n">fp</span><span class="p">,</span> <span class="p">[(</span><span class="s2">&quot;jpeg&quot;</span><span class="p">,</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span><span class="o">+</span><span class="n">im</span><span class="o">.</span><span class="n">size</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">rawmode</span><span class="p">)],</span> <span class="n">bufsize</span><span class="p">)</span>
</pre></div> <p>I don&rsquo;t want to 
change the Pillow source itself because of potential issues whenever we upgrade Pillow in the future. So the best thing I can do is modify <code>ImageFile.MAXBLOCK</code>, which is not that big of a deal, I&nbsp;think.</p> <p>I came up with the following&nbsp;solution:</p> <div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">PIL</span> <span class="kn">import</span> <span class="n">Image</span><span class="p">,</span> <span class="n">ImageFile</span>

<span class="n">new_maxblock</span> <span class="o">=</span> <span class="mi">3</span> <span class="o">*</span> <span class="n">image</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="n">image</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>  <span class="c1"># 3 bytes for every pixel in the image</span>
<span class="c1"># Remember the current value so it can be restored after saving</span>
<span class="n">old_maxblock</span> <span class="o">=</span> <span class="n">ImageFile</span><span class="o">.</span><span class="n">MAXBLOCK</span>
<span class="k">if</span> <span class="n">new_maxblock</span> <span class="o">&gt;</span> <span class="n">ImageFile</span><span class="o">.</span><span class="n">MAXBLOCK</span><span class="p">:</span>
    <span class="n">ImageFile</span><span class="o">.</span><span class="n">MAXBLOCK</span> <span class="o">=</span> <span class="n">new_maxblock</span>
<span class="n">requested_size</span> <span class="o">=</span> <span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="n">width</span><span class="p">),</span> <span class="nb">int</span><span class="p">(</span><span class="n">height</span><span class="p">))</span>
<span class="n">image</span><span class="o">.</span><span class="n">thumbnail</span><span class="p">(</span><span class="n">requested_size</span><span class="p">,</span> <span class="n">Image</span><span class="o">.</span><span class="n">ANTIALIAS</span><span class="p">)</span>
<span class="n">image</span><span class="o">.</span><span class="n">save</span><span class="p">(</span><span 
class="n">thumb_file</span><span class="p">,</span> <span class="s2">&quot;JPEG&quot;</span><span class="p">,</span> <span class="n">progressive</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="n">ImageFile</span><span class="o">.</span><span class="n">MAXBLOCK</span> <span class="o">=</span> <span class="n">old_maxblock</span>
</pre></div> <p>We determine a new maximum block size: since most <span class="caps">JPEG</span> files use 24-bit color (<span class="caps">RGB</span>), we allow 3 bytes per pixel. This might be overkill in certain situations, but I prefer overkill over not having the&nbsp;thumbnail.</p> <p>Since implementing the above solution, the <code>I/O suspension not supported in scan optimization</code> error message has not appeared in the&nbsp;logs.</p>
Replace JPEG libraries with MozJPEG
Peter van der Does, 2018-03-28T09:03:00-04:00
<p>For a project in Python we had to squeeze more bytes out of <span class="caps">JPG</span> files using Pillow. Currently MozJPEG fits that bill, but there isn&rsquo;t a repository available to install it on&nbsp;Ubuntu.</p><p>For an image-heavy site we were building, we needed to squeeze more bytes out of the <span class="caps">JPEG</span> files. We use <a href="http://pillow.readthedocs.io/en/latest/">Pillow</a> within our Python project to create thumbnails. Pillow in turn uses the <span class="caps">JPEG</span> libraries installed on your system, so we had to look for a drop-in replacement for the system <span class="caps">JPEG</span>&nbsp;libraries.</p> <p>For Ubuntu you can use <a href="https://libjpeg-turbo.org/">libjpeg-turbo</a>, but using <a href="https://github.com/mozilla/mozjpeg">MozJPEG</a> by Mozilla makes the thumbnails even smaller. 
The only problem we ran into was that there is no repository you can add in Ubuntu, so we had to compile MozJPEG&nbsp;manually.</p> <p>If you want to skip the steps, go to <a href="#tldr">tl;dr</a>. All the steps need to be run as&nbsp;root.</p> <h4 id="install-requirements">Install&nbsp;requirements</h4> <p>To compile MozJPEG you need to install some&nbsp;requirements.</p> <div class="highlight"><pre><span></span>apt -y install build-essential cmake libtool autoconf automake m4 nasm pkg-config
</pre></div> <p>Then configure the dynamic linker run-time bindings: <div class="highlight"><pre><span></span>ldconfig /usr/lib
</pre></div></p> <h4 id="get-mozjpeg-source">Get MozJPEG&nbsp;source</h4> <p>We&rsquo;ll be working with version 3.2 of the MozJPEG library. <div class="highlight"><pre><span></span>wget https://github.com/mozilla/mozjpeg/archive/v3.2.tar.gz
tar xf v3.2.tar.gz
</pre></div></p> <h4 id="configure-and-install">Configure and&nbsp;Install</h4> <p>Before we can configure and install, we have to generate the configure script. Go to the directory the archive was extracted&nbsp;into.</p> <div class="highlight"><pre><span></span><span class="nb">cd</span> mozjpeg-3.2
autoreconf -fiv
</pre></div> <p>To keep source and build separate, we&rsquo;ll do the build in its own&nbsp;directory.</p> <div class="highlight"><pre><span></span>mkdir build
<span class="nb">cd</span> build
sh ../configure --with-jpeg8
make install <span class="nv">libdir</span><span class="o">=</span>/usr/lib/x86_64-linux-gnu <span class="nv">prefix</span><span class="o">=</span>/usr
</pre></div> <p>We have to copy one header file over manually, as the install step does not include it. 
<div class="highlight"><pre><span></span>cp ../jpegint.h /usr/include/jpegint.h
</pre></div></p> <p>That&rsquo;s it. Now almost any program on your server that uses the <span class="caps">JPEG</span> libraries to create images will be using MozJPEG, making the files much smaller than with the standard or even the libjpeg-turbo&nbsp;libraries.</p> <h4 id="tldr"><span class="caps">TL</span>;<span class="caps">DR</span></h4> <p>The script below will do all the above steps. Remember to run this as&nbsp;root.</p> <div class="highlight"><pre><span></span><span class="c1">#!/bin/sh</span>
apt -y install build-essential cmake libtool autoconf automake m4 nasm pkg-config
ldconfig /usr/lib
rm -rf mozjpeg-3.2
wget https://github.com/mozilla/mozjpeg/archive/v3.2.tar.gz
tar xf v3.2.tar.gz
<span class="nb">cd</span> mozjpeg-3.2
autoreconf -fiv
mkdir build
<span class="nb">cd</span> build
sh ../configure --with-jpeg8
make install <span class="nv">libdir</span><span class="o">=</span>/usr/lib/x86_64-linux-gnu <span class="nv">prefix</span><span class="o">=</span>/usr
cp ../jpegint.h /usr/include/jpegint.h
</pre></div> <h4 id="bonus">Bonus</h4> <p>If you use Pillow in your Python project and it was already installed, you need to reinstall it. We ran into issues where, even after reinstalling, it still would not use the MozJPEG libraries. To make that work, we had to recompile Pillow. The command below will recompile&nbsp;Pillow:</p> <div class="highlight"><pre><span></span>pip install --upgrade --no-cache-dir --force-reinstall --no-binary :all: --compile -v Pillow
</pre></div>
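<p>A quick way to sanity-check the thumbnails after the rebuild, without involving Pillow at all, is to look at the JPEG frame header. The sketch below is not part of our original setup. It assumes a well-formed JPEG and uses a hypothetical helper, <code>jpeg_frame_type</code>, that walks the marker segments with the standard library only and reports whether the data is a baseline or a progressive JPEG. It says nothing about which library produced the file, so treat it as a rough check alongside comparing file sizes.</p>

```python
import struct

# Start-of-frame markers: 0xC0 is baseline DCT, 0xC2 is progressive DCT.
SOF_BASELINE = 0xC0
SOF_PROGRESSIVE = 0xC2
SOS = 0xDA  # start of scan: compressed data follows


def jpeg_frame_type(data):
    """Hypothetical helper: classify JPEG bytes by their frame header.

    Returns 'baseline', 'progressive', 'other' (some other frame type),
    or None when the data does not look like a JPEG.
    """
    if data[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None  # lost marker sync; give up rather than guess
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte in front of a marker
            pos += 1
            continue
        if marker == SOF_BASELINE:
            return "baseline"
        if marker == SOF_PROGRESSIVE:
            return "progressive"
        if marker == SOS:
            return "other"  # reached the scan without a C0/C2 frame header
        # The segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length
    return None
```

<p>Pass it the bytes of a saved thumbnail, read from disk in binary mode. For files saved with <code>progressive=True</code>, as in the <code>MAXBLOCK</code> workaround above, the result should be <code>progressive</code>.</p>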