<p>A Virtual Home, https://blog.avirtualhome.com/, 2018-08-19T15:08:00-04:00</p>
<p><strong>Memory problems with MozJPEG and Pillow</strong>, 2018-08-19T15:08:00-04:00, Peter van der Does</p>
<p>After implementing MozJPEG to create smaller files we noticed it would not always work, as we got the message &ldquo;I/O suspension not supported in scan&nbsp;optimization&rdquo;.</p>
<p>We implemented <a href="https://github.com/mozilla/mozjpeg">MozJPEG</a> with <a href="https://python-pillow.org/">Pillow 4.x</a> to create smaller thumbnails of files uploaded by users, and then noticed that this process sometimes failed. The logs showed the following error message: <code>I/O suspension not supported in scan optimization</code>. Time to enter the <span class="caps">GSO</span> workflow. <span class="caps">GSO</span> stands for Google Stack Overflow; in other words, search the Internet. Searching for the error message only turns up links to the source code of MozJPEG, which is not very helpful at&nbsp;first.</p>
<p>Time to brush up on my C knowledge. <span class="caps">OK</span>, I have never programmed in C, but that doesn&rsquo;t stop me from going through the&nbsp;source.</p>
<p>The error message is defined in <code>jerror.h</code>:</p>
<div class="highlight"><pre><span></span><span class="cp">#endif</span>
<span class="n">JMESSAGE</span><span class="p">(</span><span class="n">JERR_BAD_PARAM</span><span class="p">,</span> <span class="s">&quot;Bogus parameter&quot;</span><span class="p">)</span>
<span class="n">JMESSAGE</span><span class="p">(</span><span class="n">JERR_BAD_PARAM_VALUE</span><span class="p">,</span> <span class="s">&quot;Bogus parameter value&quot;</span><span class="p">)</span>
<span class="n">JMESSAGE</span><span class="p">(</span><span class="n">JERR_UNSUPPORTED_SUSPEND</span><span class="p">,</span> <span class="s">&quot;I/O suspension not supported in scan optimization&quot;</span><span
class="p">)</span>
<span class="cp">#ifdef JMAKE_ENUM_LIST</span>
</pre></div>
<p>So now we have to find where the <code>JERR_UNSUPPORTED_SUSPEND</code> constant is used. Luckily it appears in only one file, <code>jcmaster.c</code>:</p>
<div class="highlight"><pre><span></span><span class="k">while</span> <span class="p">(</span><span class="n">size</span> <span class="o">&gt;=</span> <span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">free_in_buffer</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">MEMCOPY</span><span class="p">(</span><span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">next_output_byte</span><span class="p">,</span> <span class="n">src</span><span class="p">,</span> <span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">free_in_buffer</span><span class="p">);</span>
  <span class="n">src</span> <span class="o">+=</span> <span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">free_in_buffer</span><span class="p">;</span>
  <span class="n">size</span> <span class="o">-=</span> <span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">free_in_buffer</span><span class="p">;</span>
  <span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">next_output_byte</span> <span class="o">+=</span> <span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">free_in_buffer</span><span class="p">;</span>
  <span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">free_in_buffer</span> <span
class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="p">(</span><span class="o">*</span><span class="n">cinfo</span><span class="o">-&gt;</span><span class="n">dest</span><span class="o">-&gt;</span><span class="n">empty_output_buffer</span><span class="p">)(</span><span class="n">cinfo</span><span class="p">))</span>
    <span class="n">ERREXIT</span><span class="p">(</span><span class="n">cinfo</span><span class="p">,</span> <span class="n">JERR_UNSUPPORTED_SUSPEND</span><span class="p">);</span>
<span class="p">}</span>
</pre></div>
<p>Cool, it seems to be related to the output buffer filling up. That is just my guess, based on the <code>empty_output_buffer</code> line: the error is raised when the destination buffer is full and the call to empty it fails. Now we have to find out where Pillow sets the buffer size for saving a <span class="caps">JPEG</span>&nbsp;image.</p>
<p>The file <code>PIL/JpegImagePlugin.py</code> handles everything related to <span class="caps">JPEG</span> images, including&nbsp;saving.</p>
<p>The whole save method is a bit large to post here, but the part below determines the buffer size and then uses it to save the image. 
The buffer is sized so that it can hold the entire image&nbsp;file.</p>
<div class="highlight"><pre><span></span><span class="n">bufsize</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">if</span> <span class="n">optimize</span> <span class="ow">or</span> <span class="n">progressive</span><span class="p">:</span>
    <span class="c1"># CMYK can be bigger</span>
    <span class="k">if</span> <span class="n">im</span><span class="o">.</span><span class="n">mode</span> <span class="o">==</span> <span class="s1">&#39;CMYK&#39;</span><span class="p">:</span>
        <span class="n">bufsize</span> <span class="o">=</span> <span class="mi">4</span> <span class="o">*</span> <span class="n">im</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="n">im</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
    <span class="c1"># keep sets quality to 0, but the actual value may be high.</span>
    <span class="k">elif</span> <span class="n">quality</span> <span class="o">&gt;=</span> <span class="mi">95</span> <span class="ow">or</span> <span class="n">quality</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
        <span class="n">bufsize</span> <span class="o">=</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">im</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="n">im</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">bufsize</span> <span class="o">=</span> <span class="n">im</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span
class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="n">im</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>

<span class="c1"># The exif info needs to be written as one block, + APP1, + one spare byte.</span>
<span class="c1"># Ensure that our buffer is big enough. Same with the icc_profile block.</span>
<span class="n">bufsize</span> <span class="o">=</span> <span class="nb">max</span><span class="p">(</span><span class="n">ImageFile</span><span class="o">.</span><span class="n">MAXBLOCK</span><span class="p">,</span> <span class="n">bufsize</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">info</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">&quot;exif&quot;</span><span class="p">,</span> <span class="sa">b</span><span class="s2">&quot;&quot;</span><span class="p">))</span> <span class="o">+</span> <span class="mi">5</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">extra</span><span class="p">)</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>

<span class="n">ImageFile</span><span class="o">.</span><span class="n">_save</span><span class="p">(</span><span class="n">im</span><span class="p">,</span> <span class="n">fp</span><span class="p">,</span> <span class="p">[(</span><span class="s2">&quot;jpeg&quot;</span><span class="p">,</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span><span class="o">+</span><span class="n">im</span><span class="o">.</span><span class="n">size</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">rawmode</span><span class="p">)],</span> <span class="n">bufsize</span><span class="p">)</span>
</pre></div>
<p>I don&rsquo;t want to 
change the Pillow source itself because of potential issues whenever we upgrade Pillow in the future. So the best thing I can do is modify <code>ImageFile.MAXBLOCK</code>, which is not that big of a deal, I&nbsp;think.</p>
<p>I came up with the following&nbsp;solution:</p>
<div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">PIL</span> <span class="kn">import</span> <span class="n">Image</span><span class="p">,</span> <span class="n">ImageFile</span>

<span class="c1"># 3 bytes for every pixel in the image</span>
<span class="n">new_maxblock</span> <span class="o">=</span> <span class="mi">3</span> <span class="o">*</span> <span class="n">image</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="n">image</span><span class="o">.</span><span class="n">size</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="n">old_maxblock</span> <span class="o">=</span> <span class="n">ImageFile</span><span class="o">.</span><span class="n">MAXBLOCK</span>
<span class="k">if</span> <span class="n">new_maxblock</span> <span class="o">&gt;</span> <span class="n">ImageFile</span><span class="o">.</span><span class="n">MAXBLOCK</span><span class="p">:</span>
    <span class="n">ImageFile</span><span class="o">.</span><span class="n">MAXBLOCK</span> <span class="o">=</span> <span class="n">new_maxblock</span>
<span class="n">requested_size</span> <span class="o">=</span> <span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="n">width</span><span class="p">),</span> <span class="nb">int</span><span class="p">(</span><span class="n">height</span><span class="p">))</span>
<span class="n">image</span><span class="o">.</span><span class="n">thumbnail</span><span class="p">(</span><span class="n">requested_size</span><span class="p">,</span> <span class="n">Image</span><span class="o">.</span><span class="n">ANTIALIAS</span><span class="p">)</span>
<span class="n">image</span><span class="o">.</span><span class="n">save</span><span class="p">(</span><span
class="n">thumb_file</span><span class="p">,</span> <span class="s2">&quot;JPEG&quot;</span><span class="p">,</span> <span class="n">progressive</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="n">ImageFile</span><span class="o">.</span><span class="n">MAXBLOCK</span> <span class="o">=</span> <span class="n">old_maxblock</span>
</pre></div>
<p>We determine a new max block size. As most <span class="caps">JPEG</span> files use 24-bit color (<span class="caps">RGB</span>), we need 3 bytes per pixel. This might be overkill in certain situations, but at times I prefer overkill over not having the&nbsp;thumbnail.</p>
<p>After implementing the above solution, the <code>I/O suspension not supported in scan optimization</code> error message has not been seen in the&nbsp;logs.</p>
<p><strong>Replace JPEG libraries with MozJPEG</strong>, 2018-03-28T09:03:00-04:00, Peter van der Does</p>
<p>For a project in Python we had to squeeze more bytes out of <span class="caps">JPG</span> files using Pillow. Currently MozJPEG fits that bill, but there isn&rsquo;t a repository available to install it on&nbsp;Ubuntu.</p>
<p>For an image-heavy site we were building we needed to squeeze more bytes out of the <span class="caps">JPEG</span> files. We use <a href="http://pillow.readthedocs.io/en/latest/">Pillow</a> within our Python project to create thumbnails, and Pillow in turn uses the <span class="caps">JPEG</span> libraries installed on the system, so we had to look for a one-to-one replacement for the system <span class="caps">JPEG</span>&nbsp;libraries.</p>
<p>For Ubuntu you can use <a href="https://libjpeg-turbo.org/">libjpeg-turbo</a>, but using <a href="https://github.com/mozilla/mozjpeg">MozJPEG</a> by Mozilla makes the thumbnails even smaller. 
The only problem we ran into was that there is no repository you can add in Ubuntu, and therefore we had to compile MozJPEG&nbsp;manually.</p>
<p>If you just want to skip the steps, go to <a href="#tldr">tl;dr</a>. All the steps need to be run as&nbsp;root.</p>
<h4 id="install-requirements">Install&nbsp;requirements</h4>
<p>To compile MozJPEG you need to install some&nbsp;requirements.</p>
<div class="highlight"><pre><span></span>apt -y install build-essential cmake libtool autoconf automake m4 nasm pkg-config
</pre></div>
<p>and then configure the dynamic linker run-time bindings <div class="highlight"><pre><span></span>ldconfig /usr/lib
</pre></div></p>
<h4 id="get-mozjpeg-source">Get MozJPEG&nbsp;source</h4>
<p>We&rsquo;ll be working with version 3.2 of the MozJPEG library. <div class="highlight"><pre><span></span>wget https://github.com/mozilla/mozjpeg/archive/v3.2.tar.gz
tar xf v3.2.tar.gz
</pre></div></p>
<h4 id="configure-and-install">Configure and&nbsp;Install</h4>
<p>Before we can configure and install, we have to generate the configure script. Go to the directory you extracted the archive&nbsp;into.</p>
<div class="highlight"><pre><span></span><span class="nb">cd</span> mozjpeg-3.2
autoreconf -fiv
</pre></div>
<p>To keep source and build separate we&rsquo;ll do the build in its own&nbsp;directory.</p>
<div class="highlight"><pre><span></span>mkdir build
<span class="nb">cd</span> build
sh ../configure --with-jpeg8
make install <span class="nv">libdir</span><span class="o">=</span>/usr/lib/x86_64-linux-gnu <span class="nv">prefix</span><span class="o">=</span>/usr
</pre></div>
<p>We have to copy one header file over, as it&rsquo;s not included in the build. 
<div class="highlight"><pre><span></span>cp ../jpegint.h /usr/include/jpegint.h
</pre></div></p>
<p>That&rsquo;s it. Now almost any program on your server that uses the <span class="caps">JPEG</span> libraries to create images will be using MozJPEG, making the files much smaller than with the standard or even the libjpeg-turbo&nbsp;libraries.</p>
<h4 id="tldr"><span class="caps">TL</span>;<span class="caps">DR</span></h4>
<p>The script below will do all the above steps. Remember to run this as&nbsp;root.</p>
<div class="highlight"><pre><span></span><span class="c1">#!/bin/sh</span>
apt -y install build-essential cmake libtool autoconf automake m4 nasm pkg-config
ldconfig /usr/lib
rm -rf mozjpeg-3.2
wget https://github.com/mozilla/mozjpeg/archive/v3.2.tar.gz
tar xf v3.2.tar.gz
<span class="nb">cd</span> mozjpeg-3.2
autoreconf -fiv
mkdir build
<span class="nb">cd</span> build
sh ../configure --with-jpeg8
make install <span class="nv">libdir</span><span class="o">=</span>/usr/lib/x86_64-linux-gnu <span class="nv">prefix</span><span class="o">=</span>/usr
cp ../jpegint.h /usr/include/jpegint.h
</pre></div>
<h4 id="bonus">Bonus</h4>
<p>If you use Pillow in your Python project and it was already installed, you need to reinstall it. We ran into issues where, even after reinstalling, it still would not use the MozJPEG libraries. To make that work, we had to recompile Pillow. The command below will recompile&nbsp;Pillow:</p>
<div class="highlight"><pre><span></span>pip install --upgrade --no-cache-dir --force-reinstall --no-binary :all: --compile -v Pillow
</pre></div>
<p><strong>Create a custom 410 error page in NGINX</strong>, 2018-03-23T08:03:00-04:00, Peter van der Does</p>
<p>I had a need to create a 410 page for a whole bunch of pages. 
As it turns out, it was not as easy as it&nbsp;sounds.</p>
<p>When converting this blog from WordPress to Pelican I decided to just ditch a whole bunch of articles I had written in WordPress. According to several articles on the net, it is best practice to have pages you delete return a 410 (Gone) status. For a better user experience I wanted users to land on a page that looks like part of the&nbsp;blog.</p>
<p>As I&rsquo;m using <span class="caps">NGINX</span>, I can utilize the map directive to create a new variable whose value depends on the values of one or more of the source variables specified in the first parameter. I created a file called <code>old_request.nginx</code> in the directory <code>/etc/nginx/snippets/</code>:</p>
<div class="highlight"><pre><span></span>map $request_uri $gone_var {
    /the-avh-amazon-plugin-has-reached-its-end-of-life/ 1;
    /wordpress-plugin-update-avh-first-defense-against-spam-v3-0/ 1;
    /end-of-the-avh-amazon-plugin/ 1;
    /wordpress-plugin-update-avh-extended-categories/ 1;
}
</pre></div>
<p>This file just needs to be included in your configuration file, and if you want to have a custom 410 error page you just have to tell <span class="caps">NGINX</span> which file to use when it encounters a 410&nbsp;error.</p>
<div class="highlight"><pre><span></span>include snippets/old_request.nginx;

server {
    ....
    error_page 404 /404.html;
    error_page 410 /410.html;

    if ($gone_var) {
        return 410;
    }

    location / {
        ....
    }
}
</pre></div>
<p>Easy enough, <strong><span class="caps">NOT</span></strong>. The above configuration does not work. If you try to browse one of the URLs mentioned in the <code>old_request.nginx</code> file, you get the default <span class="caps">NGINX</span> 410 error page and not the file you said it should show. The likely reason is that the internal redirect to <code>/410.html</code> passes through the server-level <code>if</code> again, and because <code>$request_uri</code> still holds the original <span class="caps">URL</span>, the 410 is returned a second&nbsp;time.</p>
<p>To fix this we have to use a named location. A named location has the <code>@</code> prefix. 
Such a location is not used for regular request processing, but only for request&nbsp;redirection.</p>
<div class="highlight"><pre><span></span>include snippets/old_request.nginx;

server {
    ....
    error_page 404 /404.html;
    error_page 410 @gone;

    if ($gone_var) {
        return 410;
    }

    location @gone {
        rewrite ^(.*)$ /410.html break;
    }

    location / {
        ....
    }
}
</pre></div>
<p>And now your custom 410 error page&nbsp;works.</p>
<p><strong>Move from WordPress to Pelican</strong>, 2018-03-11T22:15:24-04:00 (updated 2018-03-17T23:03:00-04:00), Peter van der Does</p>
<p>I have been playing with the thought of moving my blog from WordPress to a static website generator for years and I finally pulled the&nbsp;trigger.</p>
<p>I never really had the energy to start this project, for several reasons. I knew I would never find a theme that is completely to my liking, so I would have to change it, possibly altering some code. There were some generators written in <span class="caps">PHP</span>, my language of choice for a long time, but they did not seem very mature. The best-known static website generator was Jekyll, written in Ruby, and I really couldn&rsquo;t find the energy to start learning Ruby on the side. The other reason that kept me from moving was the question of how to move all my articles from WordPress to the static&nbsp;generator.</p>
<p>I started working for <a href="https://oneilinteractive.com/">ONeil Interactive</a> in 2017. We build websites for the home building industry and the language of choice is Python. We started looking into building an internal site for documenting our projects. 
We develop in Python, so it was quickly decided that the static generator should be written in Python too, as that would make it easier to extend the generator if&nbsp;needed.</p>
<p>As a result of this project I decided to bite the bullet for my personal site as well. After a quick look at the site <a href="https://www.staticgen.com/">StaticGen</a> and some quick research I decided to go with <a href="https://github.com/getpelican/pelican">Pelican</a>. Of course I needed a theme, and there is never a theme that completely satisfies my needs, but my starter theme is <a href="https://github.com/kdeldycke/plumage">Plumage by Kevin Deldycke</a>. I&rsquo;ve modified it to work with Bootstrap 4 and used the <a href="https://bootswatch.com/slate/">Slate theme by Bootswatch</a>.</p>
<p>I only moved two articles from my old blog, as the rest of the articles were not visited that often. I still have them in the database, so if they are ever needed I can pull them&nbsp;up.</p>
<p>Oh, in case you were wondering: for the project at work we decided to go with <a href="https://www.mkdocs.org/">MkDocs</a>.</p>