» Use pandoc as markdown processor

January 17, 2017 at 20:38 | blog.kummerlaender.eu | 23f629 | Adrian Kummerländer

The trigger, but not the actual reason, for replacing kramdown with pandoc was a strange generation issue with kramdown’s latest release.

All recent articles failed to generate anything more than an empty page. A quick check of the resulting HTML for those articles showed nothing out of the ordinary. After taking a closer look at the articles in question I narrowed the set of failing articles down to those containing footnotes. As I only started using footnotes a couple of articles ago, this explained why only recent articles were affected.

Some debugging of InputXSLT revealed the following problem: Xerces-C generated an error message and stopped processing XML inputs containing non-breaking space (nbsp) characters in the implementation of the external-command function. This change in kramdown’s output can be traced back to enhancement issue 399. Obviously this is not a problem in kramdown but an issue in the way this static site generator wraps HTML inputs.

This problem should be solvable by adding appropriate namespace and doctype declarations to the markdown-generated HTML output. Instead I opted to perform the change to pandoc I had already planned for quite some time.

The choice fell on pandoc as it offers some additional markdown features and allows for conversion to a rich set of document formats. For example, features like printing articles as PDF using LaTeX are trivial to implement if pandoc is the markdown processor of choice. Furthermore, page compilation is noticeably faster using pandoc.

One might note that this switch only solved the original issue by coincidence: should pandoc start to generate non-breaking space characters, the same problem will occur. But I have hopes that such a change would be configurable via pandoc’s plethora of configuration options. As this static site generator assumes everything to be XHTML, I see no reason why I should not continue to treat HTML inputs as XML.

» Interpose open library function

February 20, 2016 at 22:30 | change | b3ef0f | Adrian Kummerländer

open is not as side-effect free as I had imagined: if the flag O_TRUNC is passed it truncates the file contents as part of opening the file descriptor. In practice this is done by emacs prior to writing the new file content, so the call needs to be intercepted in order to start tracking the file before it is changed.

Interposing open required some changes to make the library work without including fcntl.h. This header not only defines some of the flags we require to check if a library call actually is able to change files but also defines the open library function.
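The kind of flag check mentioned above could look roughly like the following sketch. Unlike the library itself it simply includes fcntl.h to get at the flag constants, and the helper name may_modify_file is made up for illustration, i.e. it is not taken from libChangeLog.

```cpp
#include <fcntl.h>  // O_ACCMODE, O_WRONLY, O_RDWR, O_TRUNC (the library itself avoids this header)

// Hypothetical helper: decides whether an open call with the given flags
// is able to change the file and thus has to be tracked beforehand.
bool may_modify_file(int flags) {
    const int access_mode = flags & O_ACCMODE;

    return access_mode == O_WRONLY
        || access_mode == O_RDWR
        || (flags & O_TRUNC) != 0;  // truncation happens during open itself
}
```

A call passing O_WRONLY | O_TRUNC is flagged by this check while a plain O_RDONLY is not, so only the former would have to trigger tracking before the actual open is forwarded.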

While implementing this change I noticed that the function interpositions implemented in C++ actually need to be declared as extern "C" so their names do not get mangled during compilation. I suspect that this was previously done implicitly for e.g. mmap and write by the included C standard library headers. However, this did not work for open, which is why all function interpositions are now explicitly declared extern "C".

End result: emacs file changes are now tracked correctly.

» Implement static allocator for initialization

February 17, 2016 at 15:02 | change | af756d | Adrian Kummerländer

The previous interposition logic, based on plain usage of dlsym analogously to various online examples, led to a deadlock during neovim startup. This deadlock was caused by neovim’s custom memory allocation library jemalloc because it calls mmap during its initialization phase. The problem with calling mmap during initialization is that this already leads to executing libChangeLog’s mmap version whose static actual_mmap function pointer is not initialized at this point in time. This is detected and leads to a call to dlsym to remedy the situation. Sadly dlsym in turn requires memory allocation using calloc, which leads us back to initializing jemalloc and as such to a deadlock.

I first saw this as a bug in jemalloc which seemed to be confirmed by a short search in my search engine of choice. This prompted me to create an appropriate bug report which was dismissed as a problem in the way mmap was interposed and not as a bug in the library. Thus it seems to be accepted practice that it is not the responsibility of a custom memory allocator to cater to the initialization needs of other libraries relying on function interposition. This is of course a valid position as the whole issue is a kind of chicken and egg problem where both sides can be argued.

To cut to the chase: I was left with the only option of working around this deadlock by adapting libChangeLog to call dlsym without relying on the wrapped application’s memory allocator of choice. The most straightforward way to do this is to provide another custom memory allocator alongside the payload function interpositions of mmap and friends.

init/alloc.cc implements such a selectively transparent memory allocator that offers a small static buffer for usage in the context of executing dlsym. The choice between forwarding memory allocation requests to the wrapped application’s allocator and using the static buffer is governed by init::dlsymContext. This tiny helper class maintains a dlsym_level counter by posing as a scope guard.
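To make the idea more tangible, here is a minimal sketch of such a scope guard and static buffer. Only the names dlsymContext and dlsym_level stem from the description above. The buffer size, the bump allocation strategy and the helpers static_calloc, from_static_buffer, dispatch_calloc and dispatch_free are assumptions for illustration, and std::calloc / std::free merely stand in for the wrapped application’s allocator. It should not be read as the actual contents of init/alloc.cc.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <functional>

namespace sketch {

// Number of dlsym calls currently in progress (not thread-safe in this sketch).
std::size_t dlsym_level = 0;

// Scope guard: the static buffer is used for as long as an instance is alive.
class dlsymContext {
public:
    dlsymContext()  { ++dlsym_level; }
    ~dlsymContext() { --dlsym_level; }

    dlsymContext(const dlsymContext&)            = delete;
    dlsymContext& operator=(const dlsymContext&) = delete;

    static bool is_active() { return dlsym_level > 0; }
};

constexpr std::size_t buffer_size = 4096;  // assumed size, chosen arbitrarily
alignas(std::max_align_t) unsigned char buffer[buffer_size];
std::size_t buffer_used = 0;

// Bump allocation from the static buffer; memory is never handed back.
void* static_calloc(std::size_t count, std::size_t size) {
    if (count == 0 || size == 0 || count > buffer_size / size) {
        return nullptr;  // empty or oversized request; also guards the multiplication
    }
    constexpr std::size_t alignment = alignof(std::max_align_t);
    const std::size_t bytes = (count * size + alignment - 1) / alignment * alignment;
    if (bytes > buffer_size - buffer_used) {
        return nullptr;  // buffer exhausted
    }
    void* const block = buffer + buffer_used;
    buffer_used += bytes;
    return std::memset(block, 0, bytes);
}

// Checks whether a pointer was handed out by static_calloc.
bool from_static_buffer(const void* ptr) {
    const std::less<const unsigned char*> before{};
    const auto* const address = static_cast<const unsigned char*>(ptr);
    return !before(address, buffer) && before(address, buffer + buffer_size);
}

// std::calloc and std::free stand in for the wrapped application's allocator.
void* dispatch_calloc(std::size_t count, std::size_t size) {
    return dlsymContext::is_active() ? static_calloc(count, size)
                                     : std::calloc(count, size);
}

void dispatch_free(void* ptr) {
    if (!from_static_buffer(ptr)) {
        std::free(ptr);  // static blocks are simply left in place
    }
}

}
```

In an interposition library a dlsymContext instance would be created right before calling dlsym, so any calloc triggered by dlsym is served from the static buffer instead of re-entering the application’s allocator.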

The end result of this extension to libChangeLog is that it now also works with applications using jemalloc such as neovim and should overall be much more robust during its initialization phase.

» Implement support for excluding arbitrary paths from tracking

February 14, 2016 at 20:52 | change | 1ffaf3 | Adrian Kummerländer

The library may be provided with a newline-separated list of regular expressions via the newly introduced CHANGE_LOG_IGNORE_PATTERN_PATH.

Any proposed tracking path that is matched by any of the provided patterns is excluded from change reporting. This functionality uses the Standard’s regular expression parsing functionality and as such doesn’t introduce any new dependencies. If no file path is provided or the provided file path is unreadable all paths will be tracked.

change was adapted to set CHANGE_LOG_IGNORE_PATTERN_PATH to .change_log_ignore which means that it will by default exclude any paths matching the patterns provided via this file in the current working directory.

An example for such a file customized for hiding vim’s internal write logic may look as follows:

[0-9]+
[^~]*~
.*\.viminfo
.*\.swp

Note that this is implemented in a fashion where it is not guaranteed that the full canonical path is checked against the patterns. It remains to be decided if this is enough for all common use cases of this new functionality.

tracking::PathMatcher lacks any explicit thread synchronization - according to my current knowledge this should not be necessary as we are only ever reading the private std::vector<std::regex> instance. If invalid regular expressions are provided they are silently ignored.
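A minimal sketch of how such a matcher might be structured follows. Only the std::vector<std::regex> member and the silent skipping of invalid expressions are taken from the description above. The member names, the helper matcher_from_text, the skipping of blank lines and the use of std::regex_match (i.e. a pattern has to match the whole path) are assumptions and not necessarily what tracking::PathMatcher does.

```cpp
#include <cstddef>
#include <istream>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

namespace sketch {

class PathMatcher {
public:
    // Reads one regular expression per line from the given stream.
    explicit PathMatcher(std::istream& source) {
        std::string line;
        while (std::getline(source, line)) {
            if (line.empty()) {
                continue;  // assumption: blank lines carry no pattern
            }
            try {
                patterns_.emplace_back(line);
            } catch (const std::regex_error&) {
                // invalid expressions are silently ignored
            }
        }
    }

    // Assumption: a pattern has to match the whole path.
    bool is_ignored(const std::string& path) const {
        for (const std::regex& pattern : patterns_) {
            if (std::regex_match(path, pattern)) {
                return true;
            }
        }
        return false;
    }

    std::size_t size() const { return patterns_.size(); }

private:
    std::vector<std::regex> patterns_;
};

// Convenience for trying the matcher without a pattern file.
inline PathMatcher matcher_from_text(const std::string& text) {
    std::istringstream source(text);
    return PathMatcher(source);
}

}
```

Constructing the matcher from a std::ifstream works the same way. A stream that failed to open yields no patterns at all, which corresponds to the described fallback of tracking all paths.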

» Reimplemented TakeWhile and DropWhile in terms of ListIndex

February 17, 2015 at 22:47 | TypeAsValue | af5662 | Adrian Kummerländer

» Revamped partial function application

February 12, 2015 at 10:16 | TypeAsValue | ad27a7 | Adrian Kummerländer