
Make's underlying design is great (it builds a DAG of dependencies, which allows for parallel walking of the graph), but there are a number of practical problems that make it a royal pain to use as a generic build system:

1. Using make in a CI system doesn't really work, because of the way it handles conditional building based on mtime. Sometimes you just don't want the condition to be based on mtime, but rather a deterministic hash, or something else entirely.

2. Make is _really_ hard to use to try to compose a large build system from small re-usable steps. If you try to break it up into multiple Makefiles, you lose all of the benefits of a single connected graph. Read the article about why recursive make is harmful: http://aegis.sourceforge.net/auug97.pdf

3. Let's be honest, nobody really wants to learn Makefile syntax.

As a shameless plug, I built a tool similar to Make and redo, but just allows you to describe everything as a set of executables. It still builds a DAG of the dependencies, and allows you to compose massive build systems from smaller components: https://github.com/ejholmes/walk. You can use this to build anything your heart desires, as long as you can describe it as a graph of dependencies.
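To give a flavor, here's a rough sketch of a toy Walkfile for a small C program (walk invokes it as `Walkfile <phase> <target>`; `deps` prints prerequisites one per line, `exec` builds the target — see the README for the real details):

```shell
#!/bin/sh
# Hypothetical Walkfile sketch. walk(1) runs this executable twice per
# target: once with phase "deps" to learn prerequisites, once with
# phase "exec" to actually build.
walk_target() {
  phase=$1 target=$2
  case "$target" in
    hello)
      case "$phase" in
        deps) echo hello.o ;;
        exec) gcc hello.o -o hello ;;
      esac ;;
    *.o)
      case "$phase" in
        deps) echo "${target%.o}.c" ;;
        exec) gcc -c "${target%.o}.c" -o "$target" ;;
      esac ;;
  esac
}
walk_target "$@"
```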



Also, I'm always surprised how little mention there is of graphs when we start talking about build systems. If you're making a build system (for literally anything) the DAG is your best friend. This is how all major build system tools work (make, terraform, systemd (yes, it's a build system when you think about it)) and it's how we're able to parallelize execution so easily. It's important to be conscious of the fact that this is what you're doing when you're making a build system; connecting a graph.

Highly recommend reading https://ocw.mit.edu/courses/electrical-engineering-and-compu... for some theory on parallel execution for graphs, if you're interested in things like this.


> As a shameless plug, I built a tool similar to Make and redo

Also be sure to have a look at tup, which operates vastly more efficiently by simply walking the DAG in the opposite direction:

http://gittup.org/tup/

That is, instead of looking at what you want to build and checking all timestamps of dependency files, it can use e.g. inotify, then know exactly which files changed, and rebuild everything that depends on those files.

Moreover, it performs the modification check only once, at the beginning. During the build, it doesn't need to re-check everything, because it already knows which files were recreated.


Good idea, but it misses some clever tricks. What happens when a dependency does change, but not in a way that actually matters? Buck has a nice feature where Java libraries are only fully recompiled when the API of their dependencies change, since everything is dynamically linked. https://www.youtube.com/watch?v=uvNI_E0ZgZU


tup can be made to ignore unimportant changes. I had to look up the syntax, hopefully I got it correct.

I have foo.c, bar.c and a rule like:

  : foreach foo.c bar.c |> ^o^ gcc -c %f -o %o |> %B.o {objs}
  : {objs} |> gcc %f -o %o |> baz
The ^o^ part tells it to not trigger the next rules if there's been no change to the output. So if you just change your source files to use an updated license, reformatted it for clarity, etc., then nothing else will happen.

I used this with some of my literate code projects where I had tup running org-tangle for me (via an emacs script). If I'd only updated the documentation, and the code hadn't changed, nothing else would build. If I'd only changed unimportant parts of the code that generated the same object files, no new binary or library would be built.


> What happens when a dependency does change, but not in a way that actually matters

I believe that tup does have some features in that direction, but I may be mistaken.


This looks awesome! Definitely very similar to walk(1). Thanks for sharing!


I can't recall what precisely, but I tried tup with a tiny project to build, and I found it couldn't handle the dependency structure. And apparently (as of a year ago) it wasn't supposed to be handled.


That sounds strange, especially for a tiny project. What kind of dependency structure did you have?


I really can't recall but it was a very stiff roadblock. You just couldn't depend on some file or variable ..

Maybe someone who used tup more recently can pitch in


> Using make in a CI system doesn't really work, because of the way it handles conditional building based on mtime.

Any CI I've seen starts from a fresh repo checkout and rebuilds everything every time, so it's not an issue with practice.

OTOH, probably all my projects were small enough for it to not hurt when the CI builds everything from scratch every time. I might look at this from another angle if the things I worked on were mega-LOC C++ programs, and not kilo-LOC Go programs.


That works, until your build system is sufficiently large or time consuming or not C/C++ like make was originally built for. For example, at my company we have a build system for building all of our Amazon Machine Images (AMIs). It doesn't make sense to re-build images unless their dependencies have changed, but mtime just doesn't make any sense for a build system like this. Trying to coerce make into doing what we wanted was like pulling teeth.


Why didn't you just take the make sources and fix them to use <whatever> instead of the mtime() check, making it a command line option or environment variable? Or simply a .MTIME target, which would return an int-like string of rank, defaulting to mtime() if not declared.


I guess it's because of his point #3.

Honestly, if I were to modify make for some custom behavior, I would look for another tool¹ or create yet another builder tool. Because although I already know most of the syntax, point #2 is an incredibly big problem, the GP missed point #4 in that make does not manage external dependencies², and the syntax is not only bad for learning, but for using too.

1 - Almost everybody seems to agree with me on this. The only problem is that we have all migrated to language-constrained tools. This is a problem worth fixing.

2 - Whether it is installing them or simply testing if they are there, make can't do either. Hello autotools, with its syntax set to extreme cumbersomeness.


I don't know if you really understand Make. Even the original article misses point about what Make is really about in recommending `.PHONY` targets.

Make is about "making files." Or to be a little more semantically specific, Make is about "processes that transform files to make new files based upon their dependencies". It really doesn't matter that your file is a C source file, or some unpreprocessed CSS, or a template, or an AMI. To your AMI example, if you specify a dependency (or dependency chain, DAG) as a time-stamped file (or set of files), you can get make to rebuild the AMI for you along with any other supporting or intermediate files.

IMO, the suckless guys are masters at writing deceptively complex but highly readable & concise Makefiles. Here's a plan9 workalike utility library [0], a util-linux/busybox package of binaries [1], a window manager [2], a terminal emulator [3], and a webkit2-based web browser [4]. I highly recommend you study them if you're looking to up your Make game.

[0] http://git.suckless.org/9base/tree/Makefile

[1] http://git.suckless.org/sbase/tree/Makefile

[2] http://git.suckless.org/dwm/tree/Makefile

[3] http://git.suckless.org/st/tree/Makefile

[4] http://git.suckless.org/surf/tree/Makefile


The first example uses recursive make, breaking the graph :(.

I'm not saying you can't use make (we did use make before we switched to walk), it's just more painful for non C/C++ build systems. All we really want from a build system is a generic tool for describing, and walking, a graph of dependencies.


Recursion does not imply a circular dependency which is most people's biggest concern with Make recursion. A graph with loops is still a graph. A broken graph is actually 2 graphs.

If you're careful (and even if you're not), loops in your dependency graph are usually a non-issue. And if you use a single `Makefile` it'll detect the circular dependency and try to ignore it.
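For instance, GNU make will warn about a deliberate loop (with something like "Circular a <- b dependency dropped") and carry on with the rest of the graph rather than die:

```make
# A deliberate loop: make drops the circular edge with a warning
# and still builds both targets.
a: b
	touch a
b: a
	touch b
```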

Let's take an example processing Sass CSS files. You just need 2 folders in your project, `css/` and `scss/`, and a `Makefile` to process all your Sass files into CSS.

   # Locate our source assets, save a list
   SRC = $(shell ls scss/*.scss 2>/dev/null)
   # Specify a filename transformation from scss to css and convert the SRC list
   OUT = $(SRC:.scss=.css)
   # Define rule of how a scss to css transformation is supposed to happen
   css/%.css : scss/%.scss
        sass $<:$@
   # Make the `all` target depend on the OUT list
   all : $(OUT)

It takes just 5 lines to teach `make` how to process Sass files. This would process any `.scss` file you dropped in `scss/` and save it in `css/`.


Sorry but that makefile is repeating the absolutely most common mistake in make.

If you use @import in your scss file the css will not be rebuilt if only the imported file is modified.

So no. It does not take 5 lines to teach make how to build Sass. To solve this you either need to teach make about sass @import, or teach sass about make and let it generate makefiles (like gcc -MMD). Or simply just use sass --watch for incremental builds and take the recompile hit if you need to restart it for whatever reason.

(While I'm at it...Another thing missing from that makefile is source maps (.css.map) generated by the sass compiler. It's not only one css file being generated from each sass file. That will complicate the rules even further)


You don't need to (specifically) teach make about `@import`. It's another scss file and it (hopefully) should be a part of your SRC list. Yes, having something like `gcc -MMD` would be better but that's sass for you.

To fix the `.css.map` issue is simple. You can't use the pattern you've seen for yacc/bison, though. The `$@` var would have both files in it.

   %.tab.c %.tab.h: %.y
           bison -d $<
Instead, simply adjust the pattern rule to be the functional equivalent:

   css/%.css : scss/%.scss
        sass $<:$@
   css/%.css.map : scss/%.scss
        @touch $@


Even if the imported file is part of SRC list, any files importing it will not be rebuilt if only that file is modified.


Ahh. Got you. And easy fix again. Add `$(SRC)` to the pattern rule.

   css/%.css : $(SRC) scss/%.scss
        sass $<:$@
   css/%.css.map : $(SRC) scss/%.scss
        @touch $@


Now you will get correct results. But now if you change just one file, you will always rebuild everything, even files that don't import the changed file. Maybe not a big deal for SASS as you usually don't have mega LOCs of SASS and compiling is fairly quick. Just want to highlight that make isn't always as simple as it looks at second glance.


Yet another error is that if you remove one scss file, the old generated css file will not be removed after a rebuild. If the next build step does something like wildcard-include all *.css-files you will get problems.


Blind includes are a dumb thing to do (it's one easy way to exploit a source codebase). The better way is to generate your include from, you know, the actual way your codebase is. Use `$(OUT)` to build your include statement. That var says, "load any of my files", not the lazy/dangerous form of, "load any files that might actually be there".

And yes, by that logic, `SRC = $(shell ls scss/*.scss)` is dumb. That is not lost upon me.


> To solve this you either need to....

Or teach a third-party tool to output a list of @imported files for a given Sass file. Then this list can be used as that Sass file's dependencies.
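As a rough sketch (assuming double-quoted imports and the usual `_partial.scss` naming convention), such a helper could be as small as:

```shell
# Hypothetical helper: print the files @imported by a Sass file, one per
# line, so the list can feed a make prerequisite list. Assumes imports
# are double-quoted and partials live at scss/_name.scss.
scss_deps() {
  grep -o '@import "[^"]*"' "$1" | sed 's|@import "\(.*\)"|scss/_\1.scss|'
}
```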


Most languages don't build as quickly as go. Having to wait tens of minutes for a clean build is not unusual.


That brings back not-so-fond memories of trying to compile Servo on my notebook. :)

By the way, I just noticed that compilation times are another argument for microservices.


CircleCI (at least) caches build dependencies for speed, but it does it based on specific directories, not timestamps. As a result, it's not as fast as it could be because unless you munge your build to fit, it doesn't cache intermediate build products of your own code.


> so it's not an issue with practice.

I meant "in practice", of course...


> 2. Make is _really_ hard to use to try to compose a large build system ... recursive make is harmful

Note that "recursive make is harmful" does not argue against multiple Makefiles!

There's nothing wrong with using multiple Makefiles per se, as long as they include rather than call each other. In other words, the article just says that Makefiles should use sub-Makefiles via "include" rather than executing those through a separate (recursive) call to Make.

However, I agree that composition of Makefiles is still a pain, given that the included Makefile must be aware that it is executed from another (parent/grandparent) directory.
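A minimal sketch of that layout (module names hypothetical), where the top Makefile stitches per-directory fragments into one graph:

```make
# Top-level Makefile: pull each module's fragment into a single DAG.
MODULES := lib app
include $(patsubst %,%/module.mk,$(MODULES))

prog: $(OBJ)
	$(CC) $^ -o $@

# Each lib/module.mk then contains only additions, with paths relative
# to the project root, e.g.:
#   OBJ += lib/util.o
```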


> 1. Using make in a CI system doesn't really work, because of the way it handles conditional building based on mtime

This is my #1 gripe with Make, and many other build systems as well. There are so many flaws in the timestamp approach. Most are easily fixed with cryptographic hashes.

I like the OCaml build system OPAM for that matter, it internally just stores checksums. I believe it also uses timestamps to speed things up, but only for equality comparison (not for older/newer comparisons which may easily lead to wrong results).


Any reason the hashes need to be cryptographic? Is there a security consideration I'm missing?


Depending on how deep down the rabbit hole you want to go, you could argue that e.g. using md5 could allow "an attacker" to submit e.g. an innocuous image that conflicts with another source file, causing it to be excluded from the build, causing a security hole to be opened.

But that's kinda silly.

I might argue in favor of (fast) cryptographic hash algorithms in general since they're fairly well understood / implemented / hardware accelerated / tend to have extremely "balanced" random output thus less likely to accidentally conflict... but that's about all I can think of.


Makefile syntax is pretty simple, once you're used to it.

Multiple Makefiles works fairly simply with the include directive.

The thing I've found painful before, with C projects, is getting Make to recognise that it needs to rebuild when a header has changed. There are various ways around this (makedepend etc) but they've all been quite painful to set up and not quite perfect.

That said, with modern machines that have NVMe storage and massively fast processors, a complete rebuild is seldom a big time cost


Or you can let the compiler generate the header-dependency Makefiles for you:

http://make.mad-scientist.net/papers/advanced-auto-dependenc...

You already tried that? I do not find it painful to set up.
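The core of it is only a few lines; sketching from memory (see the linked page for the fully robust version):

```make
# Let the compiler emit .d dependency fragments and include them, so a
# changed header rebuilds the objects that use it. -MP adds phony
# targets so a deleted header doesn't break the build.
CFLAGS += -MMD -MP
OBJ = main.o util.o
prog: $(OBJ)
	$(CC) $(OBJ) -o $@
-include $(OBJ:.o=.d)
```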


Maybe not using exactly that method, but I am pretty sure I have tried using gcc for it. Will have a proper read of that later.

The syntax is a tad hairy though, and I'd want to adapt it - I tend to avoid compiling individual C files to objects these days, due to WHOPR optimisation.


If you don't care about incremental builds and want just a full rebuild, I would write a shell script instead of a Makefile.

That said, I think incremental builds are important for most use cases.


Oops, let me amend that. The partial order of make is still useful for parallel builds from scratch, even if you don't care about incremental builds. Like almost all other languages, shell forces a total order on execution.

On the other hand, computers are fast enough that serial full builds indeed work in some cases.


I'd argue that the make syntax and built-in features are a huge boon over starting from plain-old-shell regardless.


What are some examples of that? Shell can do basically everything make can. The syntax of both is ugly but I'll grant that make is more uniform.

And btw there is no way to use make without shell, but you can use shell without make.


I guess it probably comes down to preference, but I can take a list, write a one-line .c->.o transform, a one-line link target, add in a clean target etc etc faster with make.

Sure, I can write these as functions in bash, call once for each source file, check return codes etc etc, but I find expressing dependencies faster than writing anything like robust code in shell, and make deals with failed steps by stopping.
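i.e., the whole thing in make is roughly (file names hypothetical):

```make
# List sources, derive objects, link, compile, clean - each step one or
# two lines, with failed steps stopping the build automatically.
SRC = $(wildcard *.c)
OBJ = $(SRC:.c=.o)
prog: $(OBJ)
	$(CC) $^ -o $@
%.o: %.c
	$(CC) -c $< -o $@
clean:
	rm -f $(OBJ) prog
.PHONY: clean
```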


And as you rightly pointed out above - once you get the dependencies nailed down, you can parallel build trivially.


> 2. Make is _really_ hard to use to try to compose a large build system

Are Android and LEDE/OpenWrt big enough?

> 3. Let's be honest, nobody really wants to learn Makefile syntax.

That's probably very subjective, I find JavaScript, C++, PHP or Perl much worse :-)

Anyway, for the past years I'm cheating a lot and using CMake to get complex Makefiles almost for free.


> That's probably very subjective, I find JavaScript, C++, PHP or Perl much worse :-)

Oh definitely. I actually like Makefiles, but in my experience, more teams have a deep familiarity with some programming language, than with Makefile syntax. I haven't met very many people who have read the GNU make manual, and know all the idiosyncrasies around Makefile syntax.


+1 for cmake. It works so well, in my latest C++ project the cmake file is more or less just a listing of source files and it does its job on Windows, Mac and a few different Linux distros without any platform specific stuff. So well that I replaced the huge makefiles of the libraries I use with really small cmake files that just work and don't require modifications every 3 months because this specific distro causes problems or whatever.

It feels much more like in those IDEs where you just drag all the source files in and you're done.

I even prefer editing Visual Studio project files by hand to many makefiles out there..


> Are Android and LEDE/OpenWrt big enough?

Doesn't mean that it's not hard. :) And IIRC Google is trying to move the Android build to Blueprint because make is too complicated.


This lovely cough line brought to you by a lede/openwrt Makefile:

target/linux/ar71xx/image/Makefile: CMDLINE = $$(if $$(BOARDNAME),board=$$(BOARDNAME)) $$(if $$(MTDPARTS),mtdparts=$$(MTDPARTS)) $$(if $$(CONSOLE),console=$$(CONSOLE))


That's just a bit of cmd line building. Three clauses to build a string -

  If BOARDNAME is defined, add board=$(BOARDNAME) to the string
  If MTDPARTS is defined, add mtdparts=$(MTDPARTS) to the string
  If CONSOLE is defined, add console=$(CONSOLE) to the string
Pretty simple, if a little like the ternary operator in C.

You must have seen more complex clauses than that in shell scripts and all sorts of places.


The number of times I've forgotten that unzip restores timestamps and so make has decided to rebuild everything...

`unzip -D` is your friend.


ccache is your friend ;-)


Honestly, I've never actually written a makefile to compile a C or C++ program. In fact, I haven't written any C of my own since university.

My makefiles are usually just a way to record a data pipeline. Get these files, shove them through these scripts here and those programs there. Launch a web server to show the output.


I honestly have no idea why there's so much fandom towards Make in this thread, but for me there are a few absolutely devastating problems with Makefiles:

a) Mtimes-as-change-detection is fundamentally broken given the reality of networked file systems and... physics. (Minor problem, but extremely annoying to work around in practice.)

b) Nobody can actually really understand all the interdependencies between all the code in their system(s!), and yet Makefiles expect you to specify all of that explicitly?!? Yes, you could technically specify that -- and you'll want to -- but you won't, because you don't know and don't have the time.

c) Make support for builds that change the structure of the build during the build is abysmal. E.g. after "processing foo.xml we now have more files than we had before! What are you going to do?". Well, in Make it's some sort of custom solution with ".d" files and "gcc -M" or whatever. This is utterly broken in that it pushes all the complexity onto the user. So, yes, an elegant model, but it doesn't actually solve the problem. If you want to see a better solution see the "Shake" paper.


> Let's be honest, nobody really wants to learn Makefile syntax.

I haven't found that to be the case. Once I show people how simple it is they realise make is somewhat approachable, similar to OP's article.


Consider yourself lucky! If I had a penny for every time I had to explain automatic variables...


You can explain how to use a man page (man make RET /\$+ RET).


> As a shameless plug, I built a tool similar to Make and redo […] https://github.com/ejholmes/walk

Your README’s example shows `parse.h` as output from `Walkfile deps parse.o`, I think that is a mistake.

As for your build system (and comment), I have some questions:

1. How do you achieve using a deterministic hash as condition (and aren’t all hash functions deterministic)?

2. Why would you not be able to use mtime as a dependency? The only case I have run into is when the build depends on data from remote machines, but in that case, I think the proper solution is to have an initial “configure” step where you retrieve the data your build depends on and/or a build rule to update this data.

3. Does your build system execute the Walkfile for every node in the graph on each walk? Because that sounds like a quite noticeable overhead for larger projects.

4. Am I right in thinking that the primary advantage with your system, over make, is that a shell command is executed to obtain a target’s dependencies?


1. Yeah, hash functions are deterministic, but your input needs to be deterministic across machines too. For example, on a unix system, you may want to conditionally build if any files have changed. To do that, you could generate a deterministic hash of the dependencies with something like `find . | sort | xargs sha1sum | sha1sum | cut -d ' ' -f 1`. Including mtime in that would break across machines.

2. Mainly because mtime doesn't translate across machines; it only applies to your machine. If someone checks out the repo on their machine, mtime is different. As soon as you move a build system to CI, you need something better than mtime, like content-addressable hashes, if you intend to cache targets.

3. It executes the Walkfile twice for each target in the graph; once to determine the target's deps, and once to execute the target. This definitely hasn't been a bottleneck for anything I've used walk(1) with so far.

4. Correct! But even more so, replace "shell script" with "executable". The Walkfile can be written in any language you want, as long as it has the executable bit set.


> your input needs to be determinsitic across machines too

This is one of those things that "distcc" and "ccache" have dealt with effectively - anyone trying to build big C++ projects would be well served by looking at those tools.


Thanks for the clarifications.

As for #1, what I do not understand is how a `Walkfile` allows me to use a content hash change to trigger a rebuild.

Your documentation says that a file list should be returned for `deps`, so how does the `Walkfile` communicate that e.g. `main.o` should be updated if `sha1sum main.c` is different than on last invocation?


Good question. It's not a concern of walk(1) itself, since it's impossible to generalize for every use case. walk(1) _always_ executes the target, but the target itself can decide whether it actually needs to do work. For a C project, you would just do something like this: https://github.com/ejholmes/walk/blob/master/test/111-compil....

That's a trivial example using mtime. You could replace it with a hash function if you wanted.
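For illustration, a hash-based variant of that check (a hypothetical helper, not part of walk itself) might look like:

```shell
# Hypothetical sketch of a content-hash short-circuit, the kind of check
# a walk(1) exec step (or any build rule) could do instead of comparing
# mtimes: rebuild only when the source's checksum actually changed.
build_if_changed() {
  src=$1 target=$2 stamp=".hash.$2"
  new=$(sha1sum "$src" | cut -d' ' -f1)
  if [ -e "$target" ] && [ -e "$stamp" ] && [ "$(cat "$stamp")" = "$new" ]; then
    return 0                   # content unchanged: skip the rebuild
  fi
  cc -c "$src" -o "$target"    # the actual build step
  echo "$new" > "$stamp"       # remember what we built from
}
```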


Very much agreed :| It's great when it's kept fairly simple (for both automating random stuff and building things! and it's installed everywhere! and has some cross-platform tools!), but it quickly turns into a nest of vicious footguns unless you're extraordinarily careful.


> If you try to break it up into multiple Makefiles, you lose all of the benefits of a single connected graph

Only if each Makefile is treated as a separate rule set processed with a separate make invocation.

> Read the article about why recursive make is harmful

That (now) venerable, old paper in fact shows how to break up into multiple makefiles (called "module.mk" in its examples) which are included into one graph.

(It's possible to actually have this file be called Makefile. Not only that, but it's possible to have it so you can actually type "make" in any directory of the project where there is a Makefile, and it will correctly load and execute the rules relative to the root of the tree.)


> Let's be honest, nobody really wants to learn Makefile syntax.

Make's syntax is quite simple. It's a bunch of variables one has to memorise, and the man page is a command away. And any decent editor would know to insert literal tabs.

> Make is _really_ hard to use to try to compose a large build system from small re-usable steps.

I have no experience myself, but Linux's build system is a bunch of homebrew makefiles, all the BSDs and their ports trees build with bmake. These are enough positive examples for me to think that Make is good for big systems.


> Make's syntax is quite simple.

I'll just leave this here:

https://github.com/sapcc/limes/blob/62e07b430e2019a6c1891443... (then used in line 42)


That's a variable assignment. What's complex with that?


The fact that I even need to use variables because the language does not have proper string literals.



Main downside, as with many tools much better than make:

>walk is currently not considered stable, and there may be breaking changes before the 1.0 release.

:)

It also needs go, which is a pretty non-lightweight dependency on windows, I guess. (And personally, I don't find these dir-hierarchy connection and bash's "case $target in a) ... b) ..." any friendlier at all, though make has some quirks with variable interpolation.)

I'm not arguing for using make in very complex situations, but the usual src->intermediate->executable and generate-this-from-that of any size and count is a perfect task for make. Makefiles are unsuitable not for big projects, but for complicated build systems, where it's not enough to just do the steps in the correct order. If your build system is complicated, it should be worth it at least. Otherwise use make.

Fix: typos/grammar


walk is built with Go, which also means it's a single statically linked binary with no dependencies if you download the pre-compiled binary.

If you don't like bash syntax, you can use Python, or Ruby, or Perl, etc. Any executable can serve as a Walkfile.

It's a completely fair point that make is installed on pretty much every machine by default, which is why we won't see it going away anytime soon (nor should it, it's still good).



