Firmware (Marlin) build and release process for V1 machines

anttix · December 15, 2019, 2:09am

I think we agree here that this will be cleaner and, as you mentioned, at the end of the day the scripts can generate as many artifact types as needed. I’ll play around with actions a bit to see how we can do this. In the meantime, maybe someone can take a stab at making buildroot/bin/opt_* smarter so that their return code indicates when they fail to find anything to set/enable.

jeffeb3 · December 15, 2019, 2:11am

I can look at this, or if anyone else wants to, that’s fine. It would be great if we could get that upstream too, so we don’t have diffs in there forever.

anttix · December 15, 2019, 2:17am

Agreed, and that’s why it’s probably best done with sed, so that it doesn’t raise any red flags around compatibility and whatnot.

jeffeb3 · December 15, 2019, 6:14am

I have this mostly working. Several of the tests are failing. I am not 100% sure my sed is exactly like theirs, so I need to figure that out.

A task for tomorrow.

https://travis-ci.org/jeffeb3/Marlin/builds/625212804

jeffeb3 · December 15, 2019, 4:42pm

PR:

I shouldn’t be surprised, but it found a bunch of errors, and in an effort to get this in before 2.1.x, I have spent most of my time trying to fix them.

vicious1 · December 15, 2019, 5:42pm

Sorry to interrupt, you guys are doing some grown up work here. Fascinating to me, I will watch quietly trying not to touch anything or distract you all while your brains are on overdrive.

Trying to wrap my head around this PR: is this a test for Travis CI, or is this part of the compile test to point out obvious configuration errors?

jeffeb3 · December 15, 2019, 5:48pm

They have tests that start from a fresh config and run commands like:

opt_enable EEPROM_SETTINGS

Then they build that to make sure it compiles. Their Travis setup does that.

If they instead typed EEEEEEPROM_SETTINGS, then it would test nothing.

My fix will stop the test and print a message if it fails.

The reason we care is, we can make a script that starts with a fresh checkout, and applies the configuration for MPCNC_Rambo_blah, tests it, and creates a .zip or .hex or .bin. But if you mistype something in that setup, you wouldn’t know it. With this fix, if you type eeeeprom, it will fail right away and it’s an easy fix.
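To make the idea concrete, here is a minimal sketch of an "enable" helper that fails loudly when the option is missing. It is not Marlin’s actual opt_enable and not the code from the PR; the function name opt_enable_strict, the explicit config-file argument, and the exit code 9 are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not Marlin's real opt_enable): uncomment "#define OPT"
# in a config file, and return non-zero if OPT does not appear in it at all.
# Usage: opt_enable_strict CONFIG_FILE OPT [OPT...]
opt_enable_strict() {
  local config="$1"; shift
  local opt tmp
  for opt in "$@"; do
    # Look for "#define OPT", optionally commented out with "//".
    if ! grep -Eq "^[[:space:]]*(//[[:space:]]*)?#define[[:space:]]+${opt}([[:space:]]|\$)" "$config"; then
      echo "ERROR: opt_enable_strict can't find ${opt} in ${config}" >&2
      return 9
    fi
    tmp="$(mktemp)" || return 1
    # Strip the leading "//" but keep the original indentation.
    # Writing to a temp file avoids the GNU/BSD differences of "sed -i".
    if sed -E "s|^([[:space:]]*)//[[:space:]]*(#define[[:space:]]+${opt}([[:space:]].*)?)\$|\1\2|" "$config" > "$tmp"; then
      mv "$tmp" "$config"
    else
      rm -f "$tmp"
      return 1
    fi
  done
}

# Example (commented out so the file can be sourced safely):
#   opt_enable_strict Marlin/Configuration.h EEPROM_SETTINGS || exit 1
```

An option that is already enabled counts as found, so the helper succeeds without changing the file. Only a name that appears nowhere, such as EEEEEEPROM_SETTINGS, stops the script.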

vicious1 · December 15, 2019, 5:58pm

Thank you for explaining that. That is pretty amazing.

anttix · December 15, 2019, 7:41pm

Yeah, I suspected this may happen. On the bright side, silently broken tests should be an elegant proof that this change is valuable to the upstream. This is truly awesome, thank you for doing the hard work.

Fat-fingering is actually the lesser of the evils. The more nefarious problem with stuff that fails silently like this is that the configs are not standing still: settings get renamed all the time. Without a way to detect this, config generation code will “rot” silently and cause hard-to-detect issues. E.g. if someone renames the junction deviation setting in the future, it’s pretty darn hard to realize the setting is off unless you specifically look for it.

anttix · December 15, 2019, 7:58pm

Status update from my end.

I stayed up until 2am playing with GitHub workflows. Seriously addictive stuff.

Well, joking aside, here’s the progress so far.

They do work as advertised and are surprisingly fast. My jobs were launched in less than a minute after push.
Automated git pushes back to the same branch or to another branch work as expected.
Zipping stuff up works.
Publishing a release does not work yet. The action I attempted to use is broken for binary files: text files publish fine, but zip files end up without content. I’ll probably have to swap it out. The other one I found is really hacky to install, as it requires harvesting some API URLs. This sounds brittle. I need to keep digging to find a better one or contribute some code to fix this.
Automatic building with platformio doesn’t work yet. I didn’t find any pre-canned actions to do this. There is also a more philosophical question of whether this should be a job for Travis CI, because this is what Marlin uses. GitHub workflows are advertised as suitable for CI though, and using / learning one system instead of two is a plus there. Bottom line: this needs some more research to determine the best approach.
Automatic synchronization to upstream with the pull app works, but I’m not totally happy with the setup yet. The main gripe I have is that it creates pull requests and doesn’t seem to keep the history as clean as I would like. Given that actions support timed execution, it may be better to use an action for this. On the other hand, I was not able to get actions to trigger in response to commits generated from actions themselves (probably a deliberate safeguard to prevent loops). Some more research needed here.

Random tidbits (very technical):

Actions can be published as a separate git repository, or even stored in the same repository so it’s easy to use private actions if needed. I’d like to avoid writing action code though as someone would have to maintain it.
The implementation can either be JavaScript that uses the available APIs or it can be a Docker container. The zip action, for instance, is a Docker container with the zip package installed into it.
There is some kind of a caching layer that in theory should speed up builds by caching some files. May be useful to cache platformio installation if we go that route.

jeffeb3 · December 15, 2019, 9:09pm

I was thinking that it might work better to keep the config files and all the CI in a separate repo from Marlin. Is that what you’re thinking? Sort of how my pi image is separate from cnc.js, or octoprint. I think you’ve started down this path, but I haven’t browsed around your repo except following your direct links.

anttix · December 15, 2019, 10:24pm

The “pull” app requires the project to be in the same “fork network” to work. That’s not a big deal though given one can create “orphan” branches that pretty much start from zero.

GitHub actions, on the other hand, want to be part of the same branch they act upon. At least I wasn’t able to get them to trigger if they were not. I need to discover all of these details, as they will guide how branches/repositories can be set up.

Right now it looks like two options may be possible:

1. V1 configs/scripts on top of upstream code, rebased to update (I suspect one branch per major release + emergents, e.g. 2.0.x, bugfix-2.0.x, dev-2.1.x)
2. V1 configs/scripts in separate branches, artifact branches generated by combining this with upstream.

#1 has fewer branches but limited amount of V1 specific history as it’s cumbersome to carry a ton of commits over with rebases
#2 has more branches and more complex scripting but can have separate history.

BTW: Building on top of Scott’s attempts, I got platformio to build in an action: https://t.co/T68GqhiY3n?amp=1. Not sure how efficient it will be, though.

jeffeb3 · December 15, 2019, 10:56pm

I’m still worried about having Ryan rebase. He can learn the basics, but rebasing can get ugly quick if there are any mistakes or conflicts.

Constantly rebasing and squashing our changes, along with an occasional tag on stable versions, would probably end up pretty clean. Marlin doesn’t keep the cleanest history either, so I wouldn’t be surprised if we had to cherry-pick every once in a while. bugfix-2.0.x isn’t a child of 2.0.x, for example, so rebasing from our bugfix-2.0.x to our 2.0.x would include rebasing their commits if we aren’t careful.
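As a sketch of what “being careful” could look like: git rebase --onto replays only the commits that are on our branch but not on a chosen upstream ref, so upstream’s bugfix-only commits are left behind. The function name and the branch names in the usage comment are hypothetical; nothing here is an agreed workflow.

```shell
#!/usr/bin/env bash
# Replay only our own commits (those in OURS but not in OLD_BASE) onto NEW_BASE.
# Usage: replay_v1_commits NEW_BASE OLD_BASE OURS
replay_v1_commits() {
  local new_base="$1" old_base="$2" ours="$3"
  # Show what will be carried over, so a stray upstream commit is easy to spot.
  git log --oneline "${old_base}..${ours}"
  # Checks out OURS and moves it: commits in OLD_BASE..OURS land on NEW_BASE.
  git rebase --onto "$new_base" "$old_base" "$ours"
}

# Example with hypothetical branch names (commented out on purpose):
#   git branch v1-2.0.x v1-bugfix-2.0.x      # work on a copy of our branch
#   replay_v1_commits upstream/2.0.x upstream/bugfix-2.0.x v1-2.0.x
```

Note that the rebase rewrites the branch named in the last argument, which is why the usage comment works on a copy. Conflicts still have to be resolved by hand.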

This is a hack, but we could also just trigger builds from a set number of branches, and trigger it on a schedule, like nightly, right? Instead of pull? I haven’t looked at pull yet, but just brainstorming solutions. I’ve used a system before where it couldn’t tell when upstream changes happened, so we forced CI on new downstream PRs, and also nightly.

anttix · December 15, 2019, 11:52pm

The fork network is not an issue one way or the other. It’s actually beneficial to be in the fork network, as this allows opening PRs against Marlin upstream from the same repository, and V1 firmware is a Marlin fork (albeit a shallow one), so it’s the right concept.

Rebasing and merging can get equally ugly in my experience. If there is a need to carry changes on top of Marlin, then these should be maintained via rebasing because a long-term goal is to get these merged to upstream. The key is to try to separate out stuff we don’t want to push upstream. Different file/directory names will likely help to avoid any conflicts there but a separate branch has stronger separation and history.

anttix · December 16, 2019, 12:09am

Now for option #2 (separate branch for V1 stuff), I was thinking about the following branches:

v1-machines: scripts / example configs that are not going to be pushed upstream, such as the ones that generate individual machine configs
bugfix-2.0.x: Marlin + patches that will eventually live there (rebased)
v1-bugfix-2.0.x: bugfix-2.0.x + all V1 configs in config/examples (generated by whatever means)

I expect the v1-machines branch to be shareable between bugfix-2.0.x and 2.0.x, but not with dev-2.1.x, as it may diverge in the tooling that the scripts will use. If this happens, I’d create v1-machines-2.0.x once 2.1 diverges enough to need it.

Workflows would then be something like this:

MarlinFirmware/bugfix-2.0.x -> allted/bugfix-2.0.x (rebase, automatic sync)
allted/bugfix-2.0.x + allted/v1-machines -> [ generate configs ] -> allted/v1-bugfix-2.0.x
allted/v1-bugfix-2.0.x -> [ build ] -> release:latest
allted/tag-414 -> [ build ] -> release:414

So basically, doing a release should be a simple matter of tagging a suitable commit on the v1-2.0.x or v1-bugfix-2.0.x branch and pushing the tag.

Now this is still aspirational, as I’m not sure it will work like this, especially the automated rebase from upstream. It may be that the end result will have more branches to accommodate the limitations of the system.

jeffeb3 · December 21, 2019, 12:53pm

My PR was merged. I’m not sure what to try next. Is there a repo based on 2.0 and its children where we can review the scripts and configs?

I need to get an EXTRUDERS=0 build to try out octoprint and cnc.js to see if they will freak out.

vicious1 · December 21, 2019, 4:44pm

Well, since Scott said there was a 2.0.1 right around the corner, I was going to wait. In the end that would only be a tiny update (rebase?) from the bugfix anyway. I can start a set of fresh firmware for everything, make the changes one at a time, and test along the way. Any board in particular you want next/first?

anttix · December 21, 2019, 6:09pm

Hard to believe it’s been 6 days already :S I’m 99% done with the setup that includes updated configs to 2.0.x. Gimme a day to get it to a state where I can share it and let’s review.

anttix · December 21, 2019, 6:11pm

Octoprint did freak out but I made a change to Marlin that should calm it down. That was merged a long time ago.

anttix · December 21, 2019, 6:17pm

Hey @vicious1 what board needed 250ns ST7920_DELAY_2/3 ? The reason I ask is that these are board specific and in general should live in Marlin but if we have to set these we should not set them for all of the boards, only for the ones that need them.