
January 21 2018

Sebastian Dröge: Speeding up RGB to grayscale conversion in Rust by a factor of 2.2 – and various other multimedia related processing loops

January 20 2018

Christian Hergert: Builder happenings for January

I’ve been very busy with Builder since returning from the holidays. As mentioned previously, we’ve moved to gitlab. I’m very happy about it. I can see how this is going to improve engagement and communication within our existing community and help us retain new contributors.

I made two releases of Builder so far this month. That included both a new stable build (which flatpak users are already using) and a new snapshot for those on developer operating systems like Fedora Rawhide.

The vast majority of my work this month has been on stabilization efforts. Builder is already a very large project. Every moving part we add makes this Rube Goldberg machine just a bit more difficult to maintain. I’ve tried to focus my time on things that are brittle and either improve or replace the designs. I’ve also fixed a good number of memory leaks and safety issues. However, the memory overhead of clang sort of casts a large shadow on all that work. We really need to get clang out of process one of these days.

Over the past couple years, our coding style evolved thanks to new features like g_autoptr() and friends. Every time I come across old-style code during my bug hunts, I clean it up.
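
To make the difference concrete, here is a minimal sketch (illustrative GIO code, not lifted from Builder) of the old manual-unref style next to the g_autoptr() style:

#include <gio/gio.h>

/* Old style: every exit path has to unref by hand. */
static char *
read_first_line_old (GFile *file)
{
  GFileInputStream *stream = g_file_read (file, NULL, NULL);
  GDataInputStream *data;
  char *line;

  if (stream == NULL)
    return NULL;

  data = g_data_input_stream_new (G_INPUT_STREAM (stream));
  line = g_data_input_stream_read_line_utf8 (data, NULL, NULL, NULL);
  g_object_unref (data);
  g_object_unref (stream);

  return line;
}

/* New style: g_autoptr() releases the objects when they go out of scope. */
static char *
read_first_line_new (GFile *file)
{
  g_autoptr(GFileInputStream) stream = g_file_read (file, NULL, NULL);
  g_autoptr(GDataInputStream) data = NULL;

  if (stream == NULL)
    return NULL;

  data = g_data_input_stream_new (G_INPUT_STREAM (stream));
  return g_data_input_stream_read_line_utf8 (data, NULL, NULL, NULL);
}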

Builder learned how to automatically install Flatpak SDK Extensions. These can save you a bunch of time when building your application if you have a complex stack. Things like Rust and Mono can just be pulled in and copied into your app rather than compiled from source on your machine. In doing so, every app that uses the technology can share objects in the OSTree repository, saving disk space and network transfer.

That allowed me to create a new template, a GNOME C♯ template. It uses the Mono SDK extension and gtk-sharp for 3.x. If you want to help here, work on an omni-sharp language server plugin for us!

A new C++ template using Gtkmm was added. Given that I don’t have a lot of recent experience with Gtkmm, it’d be nice to have someone from that community come in and make sure things are in good shape.

I also did some cleanup on our code-indexer to avoid threading in our API. Creating plugins on threads turned out to be rather disastrous, so now we try extra hard to keep things on the main thread with the typical async/finish function pairs.
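
For anyone unfamiliar with the pattern, here is a sketch of a typical async/finish pair built on GTask; the names are hypothetical, not Builder’s actual API:

#include <gio/gio.h>

/* g_task_return_*() completes the task back in the caller's main context,
 * so consumers never have to deal with objects created on random threads. */
void
my_indexer_index_file_async (GObject             *self,
                             GFile               *file,
                             GCancellable        *cancellable,
                             GAsyncReadyCallback  callback,
                             gpointer             user_data)
{
  g_autoptr(GTask) task = g_task_new (self, cancellable, callback, user_data);

  /* Keep the file alive for the duration of the operation. */
  g_task_set_task_data (task, g_object_ref (file), g_object_unref);

  /* Do or schedule the indexing work here, then report the result. */
  g_task_return_boolean (task, TRUE);
}

gboolean
my_indexer_index_file_finish (GObject       *self,
                              GAsyncResult  *result,
                              GError       **error)
{
  g_return_val_if_fail (g_task_is_valid (result, self), FALSE);

  return g_task_propagate_boolean (G_TASK (result), error);
}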

I created a new messages panel to elevate warnings to the user without them having to run Builder from a terminal. If you want an easy project to work on, we need to go find interesting calls to g_warning() and use ide_context_warning() instead.
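
Roughly, the swap looks like this (a hypothetical fragment; I’m assuming ide_context_warning() takes the IdeContext plus printf-style arguments, just like g_warning()):

/* Inside plugin code that already has an IdeContext at hand. */

/* Before: only visible if Builder was launched from a terminal. */
g_warning ("Failed to load build flags: %s", error->message);

/* After: surfaced to the user in the new messages panel. */
ide_context_warning (context, "Failed to load build flags: %s", error->message);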

Our flatpak plugin now tries extra hard to avoid downloads. Those were really annoying for people when opening Builder. It took some troubleshooting in flatpak-builder, and that is fixed now.

In the process of fixing the extraneous downloading I realized we could start bundling flatpak-builder with Builder. After a couple of fixes to flatpak-builder, Builder Nightly no longer requires flatpak-builder on the host. That’s one less thing to go wrong for people going through the newcomers workflow.

We just landed the beginning of a go-langserver plugin. It seems like the language server for Go is pretty new though. We only have a symbol resolver thus far.

I found a fun bug in Vala that caused const gchar * const * parameters to async functions to turn into gchar **, int. It was promptly fixed upstream for us (thanks Rico).

Some 350 commits have landed this month so far, most of them around stabilizing Builder. It’s a good time to start playing with the Nightly branch if you’re into that.

Oh, and after some 33 years on Earth, I finally needed glasses. So I look educated now.

January 18 2018

Jim Hall: Programming with ncurses

Morten Welinder: Security From Whom, Indeed

So Spectre and Meltdown happened.

That was completely predictable, so much so that I, in fact, did predict that side-channel attacks, including those coming via javascript run in a browser, were the thing to look out for. (This was in the context of pointing out that pushing Wayland as a security improvement over plain old X11 was misguided.)

I recall being told that such attacks were basically nation-state level due to cost, complexity, and required target information. How is that prediction working out for you?

January 17 2018

Philip Chimento: Announcing Flapjack

Andy Wingo: instruction explosion in guile

Greetings, fellow Schemers and compiler nerds: I bring fresh nargery!

instruction explosion

A couple years ago I made a list of compiler tasks for Guile. Most of these are still open, but I've been chipping away at the one labeled "instruction explosion":

Now we get more to the compiler side of things. Currently in Guile's VM there are instructions like vector-ref. This is a little silly: there are also instructions to branch on the type of an object (br-if-tc7 in this case), to get the vector's length, and to do a branching integer comparison. Really we should replace vector-ref with a combination of these test-and-branches, with real control flow in the function, and then the actual ref should use some more primitive unchecked memory reference instruction. Optimization could end up hoisting everything but the primitive unchecked memory reference, while preserving safety, which would be a win. But probably in most cases optimization wouldn't manage to do this, which would be a lose overall because you have more instruction dispatch.

Well, this transformation is something we need for native compilation anyway. I would accept a patch to do this kind of transformation on the master branch, after version 2.2.0 has forked. In theory this would remove most all high level instructions from the VM, making the bytecode closer to a virtual CPU, and likewise making it easier for the compiler to emit native code as it's working at a lower level.

Now that I'm getting close to finished, I wanted to share some thoughts. Previous progress reports are on the mailing list.

a simple loop

As an example, consider this loop that sums the 32-bit floats in a bytevector. I've annotated the code with lines and columns so that you can correspond different pieces to the assembly.

   0       8   12     19
 +-v-------v---v------v-
 |
1| (use-modules (rnrs bytevectors))
2| (define (f32v-sum bv)
3|   (let lp ((n 0) (sum 0.0))
4|     (if (< n (bytevector-length bv))
5|         (lp (+ n 4)
6|             (+ sum (bytevector-ieee-single-native-ref bv n)))
7|          sum)))

The assembly for the loop before instruction explosion went like this:

L1:
  17    (handle-interrupts)     at (unknown file):5:12
  18    (uadd/immediate 0 1 4)
  19    (bv-f32-ref 1 3 1)      at (unknown file):6:19
  20    (fadd 2 2 1)            at (unknown file):6:12
  21    (s64<? 0 4)             at (unknown file):4:8
  22    (jnl 8)                ;; -> L4
  23    (mov 1 0)               at (unknown file):5:8
  24    (j -7)                 ;; -> L1

So, already Guile's compiler has hoisted the (bytevector-length bv) and unboxed the loop index n and accumulator sum. This work aims to simplify further by exploding bv-f32-ref.

exploding the loop

In practice, instruction explosion happens in CPS conversion, as we are converting the Scheme-like Tree-IL language down to the CPS soup language. When we see a Tree-IL primcall (a call to a known primitive), instead of lowering it to a corresponding CPS primcall, we inline a whole blob of code.

In the concrete case of bv-f32-ref, we'd inline it with something like the following:

(unless (and (heap-object? bv)
             (eq? (heap-type-tag bv) %bytevector-tag))
  (error "not a bytevector" bv))
(define len (word-ref bv 1))
(define ptr (word-ref bv 2))
(unless (and (<= 4 len)
             (<= idx (- len 4)))
  (error "out of range" idx))
(f32-ref ptr idx)

As you can see, there are four branches hidden in the bv-f32-ref: two to check that the object is a bytevector, and two to check that the index is within range. In this explanation we assume that the offset idx is already unboxed, but actually unboxing the index ends up being part of this work as well.

One of the goals of instruction explosion was that by breaking the operation into a number of smaller, more orthogonal parts, native code generation would be easier, because the compiler would only have to know about those small bits. However without an optimizing compiler, it would be better to reify a call out to a specialized bv-f32-ref runtime routine instead of inlining all of this code -- probably whatever language you write your runtime routine in (C, rust, whatever) will do a better job optimizing than your compiler will.

But with an optimizing compiler, there is the possibility of removing possibly everything but the f32-ref. Guile doesn't quite get there, but almost; here's the post-explosion optimized assembly of the inner loop of f32v-sum:

L1:
  27    (handle-interrupts)
  28    (tag-fixnum 1 2)
  29    (s64<? 2 4)             at (unknown file):4:8
  30    (jnl 15)               ;; -> L5
  31    (uadd/immediate 0 2 4)  at (unknown file):5:12
  32    (u64<? 2 7)             at (unknown file):6:19
  33    (jnl 5)                ;; -> L2
  34    (f32-ref 2 5 2)
  35    (fadd 3 3 2)            at (unknown file):6:12
  36    (mov 2 0)               at (unknown file):5:8
  37    (j -10)                ;; -> L1

good things

The first thing to note is that unlike the "before" code, there's no instruction in this loop that can throw an exception. Neat.

Next, note that there's no type check on the bytevector; the peeled iteration preceding the loop already proved that the bytevector is a bytevector.

And indeed there's no reference to the bytevector at all in the loop! The value being dereferenced in (f32-ref 2 5 2) is a raw pointer. (Read this instruction as, "sp[2] = *(float*)((byte*)sp[5] + (uptrdiff_t)sp[2])".) The compiler does something interesting; the f32-ref CPS primcall actually takes three arguments: the garbage-collected object protecting the pointer, the pointer itself, and the offset. The object itself doesn't appear in the residual code, but including it in the f32-ref primcall's inputs keeps it alive as long as the f32-ref itself is alive.

bad things

Then there are the limitations. Firstly, instruction 28 tags the u64 loop index as a fixnum, but never uses the result. Why is this here? Sadly it's because the value is used in the bailout at L2. Recall this pseudocode:

(unless (and (<= 4 len)
             (<= idx (- len 4)))
  (error "out of range" idx))

Here the error ends up lowering to a throw CPS term that the compiler recognizes as a bailout and renders out-of-line; cool. But it uses idx as an argument, as a tagged SCM value. The compiler untags the loop index, but has to keep a tagged version around for the error cases.

The right fix is probably some kind of allocation sinking pass that sinks the tag-fixnum to the bailouts. Oh well.

Additionally, there are two tests in the loop. Are both necessary? Turns out, yes :( Imagine you have a bytevector of length 1025. The loop continues until the last ref at offset 1024, which is within bounds of the bytevector but has only one byte available at that point, so we need to throw an exception. The compiler did as good a job as we could expect it to do.

is it worth it? where to now?

On the one hand, instruction explosion is a step sideways. The code is more optimal, but it's more instructions. Because Guile currently has a bytecode VM, that means more total interpreter overhead. Testing on a 40-megabyte bytevector of 32-bit floats, the exploded f32v-sum completes in 115 milliseconds compared to around 97 for the earlier version.

On the other hand, it is very easy to imagine how to compile these instructions to native code, either ahead-of-time or via a simple template JIT. You practically just have to look up the instructions in the corresponding ISA reference, is all. The result should perform quite well.

I will probably take a whack at a simple template JIT first that does no register allocation, then ahead-of-time compilation with register allocation. Getting the AOT-compiled artifacts to dynamically link with runtime routines is a sufficient pain in my mind that I will put it off a bit until later. I also need to figure out a good strategy for truly polymorphic operations like general integer addition; probably involving inline caches.

So that's where we're at :) Thanks for reading, and happy hacking in Guile in 2018!

January 16 2018

Umang Jain: GNOME Photos: Happenings
Federico Mena-Quintero: Help needed for librsvg 2.42.1
Alexander Larsson: Fixing flatpak startup times

Daniel Espinosa: GXml is near for ABI stability

Today I managed to create a patch to provide ABI stability for GXml and any other Vala library.

ABI is one of the most important aspects of a library; it allows producing binaries that fix issues and add features while applications that depend on the library don’t need to be recompiled.

Vala libraries need to add annotations in order to produce binaries that are interoperable with applications linked against an older version; Gee is the best example.

Now, with the work referred to above, you can easily manage ABI without worrying about annotations; just take care of the order in which your virtual/abstract methods and properties are declared in your source code, in order to preserve your library’s ABI.
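
To see why declaration order matters, consider a generic C sketch (not GXml’s actual code): valac turns each virtual or abstract method into a function-pointer slot in the class struct, so reordering declarations shifts every slot and breaks binaries built against the old layout.

#include <glib-object.h>

typedef struct _Foo Foo;

typedef struct _FooClass
{
  GObjectClass parent_class;

  /* Callers compiled against this layout reach "write" through the second
   * slot; swapping "read" and "write" in a later release would silently
   * call the wrong function in already-built applications. */
  gboolean (*read)  (Foo *self, GError **error);
  gboolean (*write) (Foo *self, GError **error);

  /* Reserved slots so new virtual methods can be appended later without
   * growing the struct and breaking the ABI. */
  gpointer padding[8];
} FooClass;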

January 15 2018

Daniel Espinosa: ABI stability for GXml

I’m taking a deep dive through the Vala codebase, trying to figure out how things work. With my recent work on abstract methods for compact classes, I may have an idea of how to provide ABI stability for GXml.

GXml has a lot of interfaces for DOM4, implemented in classes like the Gom* series. But there are many of them, so going through each one and adding annotations, as Gee did, to improve ABI is hard work.

I think it is better to improve Vala itself to produce ABI stability from the beginning; this will help GXml, GSVG (implementing the W3C SVG 1.1 interfaces) and GSVGtk have abstract classes and interfaces with good ABI stability without changing a line of code in them.

In the process, we can have a reproducible API, that is: the same C header from one compilation of the Vala code to the next, and when you add new API. Of course, this means you should follow some basic rules when writing Vala code, but no more than the ones on the C side, well, maybe a few more. When this is in place, you can add your library’s header to your repository to track changes to it; once new API has been added, you should be able to keep an eye on the ABI and API, to make sure they stay consistent over time.

 


Christian Hergert: Musings on bug trackers

Over the past couple of weeks we migrated jsonrpc-glib, template-glib, libdazzle, and gnome-builder all to gitlab.gnome.org. It’s been a pretty smooth process, thanks to a lot of hard work by a few wonderfully accommodating people.

I love bugzilla, I really do. I’ve used it nearly my entire career in free software. I know it well, I like the command line tool integration. But I’ve never had a day in bugzilla where I managed to resolve/triage/close nearly 100 issues. I managed to do that today with our gitlab instance and I didn’t even mean to.

So I guess that’s something. Sometimes modern tooling can have a drastic effect rather immediately.

January 13 2018

Sebastian Dröge: How to write GStreamer Elements in Rust Part 1: A Video Filter for converting RGB to grayscale

January 12 2018

Federico Mena-Quintero: Librsvg gets Continuous Integration
Miguel de Icaza: Interactive Line Editing in .NET
Julita Inca: 2017: My FLOSS Year in Review

January 11 2018

Andy Wingo: spectre and the end of langsec

I remember in 2008 seeing Gerald Sussman, creator of the Scheme language, resignedly describing a sea change in the MIT computer science curriculum. In response to a question from the audience, he said:

The work of engineers used to be about taking small parts that they understood entirely and using simple techniques to compose them into larger things that do what they want.

But programming now isn't so much like that. Nowadays you muck around with incomprehensible or nonexistent man pages for software you don't know who wrote. You have to do basic science on your libraries to see how they work, trying out different inputs and seeing how the code reacts. This is a fundamentally different job.

Like many I was profoundly saddened by this analysis. I want to believe in constructive correctness, in math and in proofs. And so with the rise of functional programming, I thought that this historical slide from reason towards observation was just that, historical, and that the "safe" languages had a compelling value that would be evident eventually: that "another world is possible".

In particular I found solace in "langsec", an approach to assessing and ensuring system security in terms of constructively correct programs. One obvious application is parsing of untrusted input, and indeed the langsec.org website appears to emphasize this domain as one in which a programming languages approach can be fruitful. It is, after all, a truth universally acknowledged, that a program with good use of data types, will be free from many common bugs. So far so good, and so far so successful.

The basis of language security is starting from a programming language with a well-defined, easy-to-understand semantics. From there you can prove (formally or informally) interesting security properties about particular programs. For example, if a program has a secret k, but some untrusted subcomponent C of it should not have access to k, one can prove if k can or cannot leak to C. This approach is taken, for example, by Google's Caja compiler to isolate components from each other, even when they run in the context of the same web page.

But the Spectre and Meltdown attacks have seriously set back this endeavor. One manifestation of the Spectre vulnerability is that code running in a process can now read the entirety of its address space, bypassing invariants of the language in which it is written, even if it is written in a "safe" language. This is currently being used by JavaScript programs to exfiltrate passwords from a browser's password manager, or bitcoin wallets.

Mathematically, in terms of the semantics of e.g. JavaScript, these attacks should not be possible. But practically, they work. Spectre shows us that the building blocks provided to us by Intel, ARM, and all the rest are no longer "small parts understood entirely"; that instead now we have to do "basic science" on our CPUs and memory hierarchies to know what they do.
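
The canonical illustration is the bounds-check-bypass gadget from the Spectre paper by Kocher et al., sketched below as victim code (a sketch, not a working exploit):

#include <stddef.h>
#include <stdint.h>

unsigned int array1_size = 16;
uint8_t array1[16];
uint8_t array2[256 * 4096];
uint8_t temp;  /* keeps the compiler from optimizing the read away */

void
victim_function (size_t x)
{
  /* Architecturally, an out-of-range x never reads array1.  Speculatively,
   * a mistrained branch predictor can run the body anyway, load the secret
   * byte array1[x], and use it to index array2; which array2 cache line got
   * touched survives the squashed speculation and can be recovered with a
   * timing probe. */
  if (x < array1_size)
    temp &= array2[array1[x] * 4096];
}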

What's worse, we need to do basic science to come up with adequate mitigations for the Spectre vulnerabilities (side-channel exfiltration of results of speculative execution). Retpolines, poisons and masks, et cetera: none of these are proven to work. They are simply observed to be effective on current hardware. Indeed, mitigations are anathema to correctness-by-construction: if you can prove that a problem doesn't exist, what is there to mitigate?

Spectre is not the first crack in the edifice of practical program correctness. In particular, timing side channels are rarely captured in language semantics. But I think it's fair to say that Spectre is the most devastating vulnerability in the langsec approach to security that has ever been uncovered.

Where do we go from here? I see but two options. One is to attempt to make the machines targeted by secure language implementations behave rigorously as architecturally specified, and in no other way. This is the approach taken by all of the deployed mitigations (retpolines, poisoned pointers, masked accesses): modify the compiler and runtime to prevent the CPU from speculating through vulnerable indirect branches (prevent speculative execution), or from using fetched values in further speculative fetches (prevent this particular side channel). I think we are missing a model and a proof that these mitigations restore target architectural semantics, though.

However if we did have a model of what a CPU does, we have another opportunity, which is to incorporate that model in a semantics of the target language of a compiler (e.g. micro-x86 versus x86). It could be that this model produces a co-evolution of the target architectures as well, whereby Intel decides to disclose and expose more of its microarchitecture to user code. Cacheing and other microarchitectural side-effects would then become explicit rather than transparent.

Rich Hickey has this thing where he talks about "simple versus easy". Both of them sound good but for him, only "simple" is good whereas "easy" is bad. It's the sort of subjective distinction that can lead to an endless string of Worse Is Better Is Worse Bourbaki papers, according to the perspective of the author. Anyway transparent caching in the CPU has been marvelously easy for most application developers and fantastically beneficial from a performance perspective. People needing constant-time operations have complained, of course, but that kind of person always complains. Could it be, though, that actually there is some other, better-is-better kind of simplicity that should replace the all-pervasive, now-treacherous transparent cacheing?

I don't know. All I will say is that an ad-hoc approach to determining which branches and loads are safe and which are not is not a plan that inspires confidence. Godspeed to the langsec faithful in these dark times.

January 10 2018

Richard Hughes: Phoning home after updating firmware?

Jiri Eischmann: Flathub, Snap, Fedora: what is more up-to-date?

Yesterday I wondered how Flathub and Snap are doing in terms of providing up-to-date applications and how they compare to Fedora, a traditional and quite progressive Linux distribution.

The comparison is not extremely scientific. I picked (pretty much randomly) 16 apps which are available from all three sources and looked up the available version and when it was updated. This subset is not very large. Flathub tends to have popular open source applications well known from Linux distributions. Snap lacks many of these, but has quite a few apps outside the traditional Linux desktop world. And lastly, Fedora doesn’t have many multimedia apps which include patent-protected codecs (VLC, Kdenlive, MPV, …).

To find out the app version and last update date I relied on Github repositories for Flathub, on uApp explorer for Snap, and on Fedora packages app for Fedora (27).

Looking at the table, you can see that the differences are not big. Flathub generally offers the most up-to-date apps, having the latest version of every app in the list except for one minor update it missed for Eye of GNOME, and it was also usually the first one to offer the latest version.

Fedora’s results are pretty surprising to me. One of the biggest advantages Flatpak and Snap claim to have over traditional Linux distributions is that they ship the latest and greatest, but apparently, at least for desktop apps, Fedora is not behind: it offers the latest versions as well (with two exceptions in this list), often very close behind and sometimes even ahead of the two competitors.

Of course a distribution model like Flatpak still has other advantages (and also disadvantages): sandboxing, the ability to run on older distributions (e.g. RHEL 7), etc. But if you’re only after the latest versions, Flathub and Snap don’t give you a big advantage over the Fedora repositories. And if the Fedora Project offers a Flatpak repository built from Fedora packages, as we plan, it could actually be a hit, because it will be able to offer up-to-date applications in a much larger number than the current Flathub or Snap Store.

App              | Flathub        | Snap            | Fedora
Darktable        | 2.4.0, Dec 24  | 2.2.5, Oct 25   | 2.4.0, Jan 1
Blender          | 2.79, Sept 26  | 2.79, Sept 11   | 2.79, Sept 30
Corebird         | 1.7.3, Nov 19  | 1.7.3, Nov 20   | 1.7.3, Nov 28
GnuCash          | 2.6.19, Jan 5  | 2.6.19, Dec 18  | 2.6.18, Oct 30
Inkscape         | 0.92.2, Aug 9  | 0.92.2, Aug 19  | 0.92.2, Oct 1
LibreOffice      | 5.4.4, Dec 20  | 5.4.3.2, Dec 1  | 5.4.4.2, Dec 19
Nextcloud client | 2.3.3, Nov 24  | 2.3.3, Dec 11   | 2.3.3, Oct 5
Picard           | 1.4.2, Sept 27 | 1.4.2, Oct 7    | 1.3.2, Jul 14
GNOME Calendar   | 3.26.2, Oct 5  | 3.26.0, Sept 22 | 3.26.2, Oct 11
Evince           | 3.26.0, Nov 9  | 3.26.0, Nov 29  | 3.26.0, Sept 18
Eye of GNOME     | 3.26.1, Nov 7  | 3.26.2, Nov 29  | 3.26.2, Nov 15
gedit            | 3.22.1, Jul 31 | 3.22.1, Nov 29  | 3.22.1, Aug 3
Glade            | 3.20.2, Dec 15 | 3.20.0, Nov 29  | 3.20.2, Dec 10
GNOME Characters | 3.26.2, Nov 7  | 3.26.2, Nov 29  | 3.26.2, Nov 11
GIMP             | 2.8.22, Oct 17 | 2.8.22, Dec 11  | 2.8.22, Nov 11
HexChat          | 2.2.14, Apr 12 | 2.2.14, Feb 5   | 2.2.14, Dec 12 2016

January 09 2018

Jehan Pagès: How to fix broken custom file icons (Nautilus, GIO)