ToDo for jenkins.debian.net =========================== :Author: Holger Levsen :Authorinitials: holger :EMail: holger@layer-acht.org :Status: working, in progress :lang: en :Doctype: article :Licence: GPLv2 == About jenkins.debian.net See link:https://jenkins.debian.net/userContent/about.html["about jenkins.debian.net"] for a general description of the setup. Below is the current TODO list, which is long and probably incomplete too. The links:https://jenkins.debian.net/userContent/contributing.html[the preferred form of contributions] are patches via pull requests. == Fix user submitted bugs * There are link:https://bugs.debian.org/cgi-bin/pkgreport.cgi?tag=jenkins;users=qa.debian.org%40packages.debian.org["bugs filed against the pseudopackage 'qa.debian.org' with usertag 'jenkins'"] in the BTS which would be nice to be fixed rather sooner than later, as some people actually care. == General ToDo * replace amd64 in scripts with $HOSTARCH * extend /etc/rc.local to do cleanup of lockfiles * explain in README how to write jobs, eg which pathes are on tmpfs ** EXECUTOR_NUMBER for X * fix apache ssl configuration as hinted by eg https://sslcheck.globalsign.com/en/sslcheck?host=jenkins.debian.net#78.137.96.196 * run all bash scripts with set -u and set -o pipefail: http://redsymbol.net/articles/unofficial-bash-strict-mode/ * teach bin/chroot-*.sh and bin/d-i_build.sh how to nicely deal with network problems… (as both reproducible_build.sh and schroot-create.sh do) === ToDo for improving disk space * make live-build jobs work again or remove them * make sure the live-build jobs clean up /srv/live-build/results/*iso once they are done. thats 8gb wasted. === TODO for testing stretch Most jobs have been converted, a few are left to do: * add g-i tests for stretch === To be done once jenkins.d.n runs jessie * replace with bin/setsid.py workaround with setsid from the util-linux package from jessie * bin/g-i-installation: use lvcreate without --virtualsize * check if the sudo workaround in bin/g-i-installation is still needed: 'guestmount -o uid=$(id -u) -o gid=$(id -g)' would be nicer, but it doesnt work: as root, the files seem to belong to jenkins, but as jenkins they cannot be accessed. === To be done once jenkins.d.n runs stretch * install botch from stretch and remove botch from the reproducible-unstable schroot ** botch now depends on a newer dose3, which depends on the ocaml from stretch. ocaml cannot be sensibly backported, so thats why this will have to wait for stretch === move this setup to jenkins.d.o The plan is to run a jenkins.d.o host, which is maintained by DSA, but we are maintaining jenkins on it (so we can install any plugins we like etc). then we also setup several jenkins nodes, in the long term probably/maybe also maintained by DSA, on which we can use sudo as we need it. ==== next steps for jenkins.d.o migration ** the machine jerea.debian.org is setup, please go to https://jenkins.debian.org ** create jenkins-job-builder backport from sid to jessie and install that (we + DSA) *** work in progress. #799277 is the blocker to upload to bpo, but we can just use it with python2 *** on jenkins.d.n we now run j-j-b 1.3.0-5 directly backported from sid, just with the python3 stuff disabled… ** install all the plugins (we) ** add all the nodes as nodes to jenkins.d.o (we) *** use static IP for the nodes (h01ger) ** disable job execution on jenkins.d.net(!) (we) ** deploy this configuration on jenkins.d.o…(!) (we) *** make update_jdn.sh warn if things are missing on .debian.org systems *** as we dont want irc nor mail notifications for this during the migration, we should disable those with an easily revertable commit before actual deployment *** then rename jenkins.debian.net to profitbricks-build0-amd64 - and switch all the jobs which used to run on the master node on that node, which already has the right sudoers, usercontent/reproducible/ and reproducible.db *** some authorized_keys will also need to be adopted for the change of IP address from jenkins.d.n to jenkins.d.o *** redirect jenkins.debian.net to jenkins.debian.org - reproducible.debian.net will stay where it is. * party! ===== leftover notes for jenkins.d.o migration * livescreenshot plugin (we use a patched version) is ok: ** jenkins maintenance probably is best done by jenkins users (as opposed to DSA) so it's up to us what plugins we install * chroot jobs should use real schroot sessions, and not just use schroot as poor chroot(8) replacement. some links: ** https://anonscm.debian.org/cgit/mirror/dsa-puppet.git/tree/modules/schroot ** https://anonscm.debian.org/cgit/mirror/dsa-puppet.git/tree/modules/porterbox/files/dd-schroot-cmd ** https://gitweb.torproject.org/project/jenkins/tools.git/tree/slaves/linux/build-wrapper ==== proper backup * postponed til we run on .debian.org * this needs to be backed up: * '/var/lib/jenkins/jobs' (the results - the configs are in .git) * '/var/lib/munin' * '/var/log' * '/root/' (contains etckeeper.git) * '/var/lib/jenkins/reproducible.db' (is backed up manually) * '/srv/jenkins.debian.net-scm-sync.git' (is backed up manually) * '/var/lib/jenkins/plugins/*.jpi' (can be derived from jdn-scm-sync.git) * '/srv/jenkins.debian.net-scm-sync.git' * '/etc/.git' and '/etc' === To be done once bugs are fixed * link:https://bugs.debian.org/767260[#767260] workaround in bin/d-i_build.sh (console-setup doesn't support parallel build) * link:https://bugs.debian.org/767032[#767032] manual fix in etc/munin/plugins/munin_stats * link:https://bugs.debian.org/767100[#767100] work in progress in etc/munin/plugins/cpu * link:https://bugs.debian.org/767018[#767018] work in progress in etc/munin/plugins/iostat_ios * link:https://bugs.debian.org/774685[#774685] workaround in bin/reproducible_create_meta_pkg_sets.sh === jenkins-job-builder related * investigate whether its possible nowadays to let it delete jobs which were removed.. nope. though it is possible to compare the list of all jobs which should be generated with those which are there and delete those which should not. * yaml should be refactored, lots of duplication in there. this seems to be helpful: http://en.wikipedia.org/wiki/YAML#References (pyyaml which jenkins-job-builder uses supports them) === debugging job runs should be made easy ---- < h01ger> | i think the jenkins-debug-job script should be a python script < h01ger> | and j-j-b or another yaml parser can supply job configuration knowledge to that script < h01ger> | \o/ < h01ger> | and that python script can also first determine whether the environment is as needed for the job, and if not, complain verbosely+helpfully and exit ---- == Improve existing tests === tests.reproducible-builds.org * move /* to /debian/* * rename r.d.n to tests.reproducible-builds.org ** use redirects to keep old URLs working * setup pb9 and move rebootstrap jobs over there * configure pb4 running 398 days in future and adapt coreboot|openwrt|netbsd|archlinux|fedora jobs to use it * common code base (and sql database layouts - not databases) for creating the webpages for debian|coreboot|openwrt|netbsd|archlinux|fedora * common navigation (using left side navi menu like on the debian pkg pages), but different themes for dashboards, indexes and package pages * improve the intro texts: ** explain status in plain english on each dashboard, plus add an "executive summary about reproducible builds in the free software world" ** document in the non-debian pages, when we don't have a clear idea yet, how to record+reproduce the build environment and that this is essential for reproducible builds too. * install cbfstool in diffoscope schroots: (useful for openwrt+coreboot) ** 'git clone --recursive http://review.coreboot.org/p/coreboot.git ; cd coreboot/util/cbfstool ; make ; cp cbfstool $TARGET/usr/local/bin/' === Debian reproducible builds * make reproducible_build.sh rock solid again and get rid off "set -x # # to debug diffoscoppe/schroot problems" ** add check if package to be build has been blacklisted since scheduling and abort ** on SIGTERM, also cleanup on remote build nodes there! (via ssh &) ** fix: "DIFFOSCOPE='E: Failed to change to directory /tmp: Permission denied' - maybe by making sure the cause is gone… https://jenkins.debian.net/job/reproducible_builder_amd64_14/909/ is an example for that ** reenable disorderfs setup, check that it *always* unmounts + cleans up nicely * higher prio: ** dashboard: number of cores used is wrong for amd64… ** check that build nodes have different amount of cores, so we dont need to run the 2nd build with NUM_CPU-1 ** pkg pages *** new table in pkg/test history page: schedule - if that package is currently scheduled *** add link to pkg set(s) if pkg is member of some ** make maintenance job detect and reschedule logs with: 'E: 10mount: error: Directory '.*' does not exist' ** make maintenance job detect and reschedule logs with: '^Bus Error$' ** scheduler should automatically schedule 404 packages ** use schroot tarballs (gzipped), moves are atomic then ** notes related: *** #786396: classify issue by "toolchain" or "package" fix needed: show bugs which block a bug *** new page with annoted packages without categorized issues (and probably without bugs as only note content too, else there are too many) *** new page with packages that have notes with comments (which are often useful / contain solutions / low-hanging fruits for newcomers) *** new page with notes that doesnt make sense: a.) packages which are reproducible but should not, packages that build but shouldn't, etc. *** new page with packages which are reproducible on one arch and unreproducible on another arch (in the same suite, so unstable only atm) *** new page with packages which ftbfs on one arch and build fine on another arch (in the same suite, so unstable only atm) *** new page with packages which ftbfs in testing but build fine on sid ** new pages: r.d.n/$maintainer-email redirecting to r.d.n/maintainers/unstable/${maintainer-email}.html, showing the unreproducible packages for that address. and a sunny "yay, thank you"-summary for those with only reproducible packages. ** improve ftbfs page: list packages without bugs and notes first ** mattia: .py scripts: UDD or any db connection errors should either be retried or cause an abort (not failure!) of the job ** repo-comparison: check for binaries without source ** bin/_html_indexes.py: bugs = get_bugs() # this variable should not be global, else merely importing _html_indexes always queries UDD ** re-enable linux-2.6 usage once 806911 has been fixed * lesser prio ** new page: packages which are orphaned but have a reproducible usertagged patch ** remove the rescheduling reason from the DB, that's really not needed ** scheduler: check if there have been more than X failures or depwait in the last Y hours and if so unschedule all packages, disable scheduling and send a mail informing us. ** check that cleanup of old diffscope schroots on armhf+amd64 nodes works.... ** check that diffoscope schroot have /usr/bin/ar installed, else fail the job ** check that /srv/workspace/pbuilder/ is cleaned up properly ** rewrite bin/schroot-create.sh from scratch, with little sudo. ** right align all numbers in table in the dashboard ** pkg sets related: *** link to that pkg set on another arch *** fix essential set: currently it only has the ones explicitly marked Essential:yes; they and their dependencies make up the full "essential closure set" (sometimes also called pseudo-essential) *** replace bin/reproducible_installed_on_debian.org with a proper data provider from DSA, eg https://anonscm.debian.org/cgit/mirror/debian.org.git/plain/debian/control *** reproducible_create_meta_pkg_sets uses schroot created by dpkg_setup_schroot_jessie job (outside of reproducible job space...) ** "fork" etc/schroot/default into etc/schroot/reproducible ** a reproducible_log_grep_by_sql.(py|sh) would be nice, to only grep in packages with a certain status (build in the last X days) ** adopt usertag script from pkg-apparmor to notify us about new usertagged bugs automatically ** database issues *** stats_build table should have package ids, not just src+suite+arch as primary key *** move "untested" field in stats table too? (as in csv output...) ** blacklist script should tell if a package was already blacklisted. also proper options should be used... ** maintenance.sh: delete the history pages once a page has been removed from all suites+archs ** new page showing arch all packages which are cross-reproducible, and those which are not ** debbindiff2diffoscope rename: do s#dbd#ds#g and s#DBD#DS#g and rename dbd directories? ** diffoscope needs to be run on the target arch... (or rather: run on a 64bit architecture for 64bit architectures and on 32bit for 32 bit archs), this should probably be doable with a simple i386 chroot on the host (so using qemu-static to run it on armhf should not be needed, probably.) * support for arbitrary (to be implemented) Debian-PPAs and external repos, by just giving a source URL * once stabilized notification emails should go through the package tracker. The 'build' keyword seems to be the better fit for this. To do so just send the emails to dispatch@tracker.debian.org, setting "X-Distro-Tracker-Package: foo" and "X-Distro-Tracker-Keyword: build". This way people wanting to subscribe to our notification don't need to ask us and can do that by themselves. * missing tests: ** different cpu type (WIP) ** prebuilder does (user) group variation like this: https://anonscm.debian.org/cgit/reproducible/misc.git/tree/prebuilder/pbuilderhooks/A02_user ** variation of $TERM and $COLUMN (and maybe $LINES), unset in the first run, set to "linux" and "77" (and maybe "42") in the 2nd run. maybe vary $SHELL too. *** actually TERM is set to "linux" by default already, COLUMN is unset ** variation of users shell (bash + zsh|dash?) ==== reproducible Debian armhf * make systems send mail, use port 465 ==== reproducible Debian installation * see https://wiki.debian.org/ReproducibleInstalls * add the test (something weekly or so) ==== reproducible coreboot * add more variations: domain+hostname, uid+gid, USER, UTS namespace * build the docs? * also build with payloads. x86 use seabios as default, arm boards dont have a default. grub is another payload. and these: bayou coreinfo external filo libpayload nvramcui - and: ** CONFIG_PAYLOAD_NONE=y ** CONFIG_PAYLOAD_ELF is not set ** CONFIG_PAYLOAD_LINUX is not set ** CONFIG_PAYLOAD_SEABIOS is not set ** CONFIG_PAYLOAD_FILO is not set ** CONFIG_PAYLOAD_GRUB2 is not set ** CONFIG_PAYLOAD_TIANOCORE is not set * libreboot ships images, verify those? * explain status in plain english * use disorderfs for 2nd build? ==== reproducible OpenWrt * add credit for logo/artwork * build more archs (http://downloads.openwrt.org/chaos_calmer/15.05-rc1/ lists many to choose from) * incorporate popular third-party ("external feeds") packages? * explain status in plain english * use disorderfs for 2nd build? ==== reproducible NetBSD * explain status in plain english ** MKREPRO is set to "yes" * use disorderfs for 2nd build? ==== reproducible FreeBSD * useful improvements: ** investigate how to use tmpfs on freebsd and build there ** find a way to be informed about updates and keep it updated ** modify PATH, uid, gid and USER too and host+domainname as well. The VM is only used for this, so we could change the host+domainname temporaily between builds too. ** add freebsd vm as node to jenkins and run the script directly there, saves lot of ssh hassle ** run diffoscope nativly * random notes, to be moved to README ** we build freebsd 10.1 (=released) atm ** we build with sudo too *** rather not change /usr/obj to be '~jenkins/obj' and build with WITH_INSTALL_AS_USER. also not build in /usr/src. if so, we need to define some variable so we can do so.... but we need a stable path anyway, so whats the point. *** maybe build as user in /usr/src... * first build world, later build ports (pkg info...) * document how the freebsd build VM was set up: ** base 10.1 install following https://www.urbas.eu/freebsd-10-and-profitbricks/ ** modified files: *** /etc/rc.conf *** /etc/resolv.conf *** /boot/loader.conf.local ** pkg install screen git vim sudo denyhosts munin-node ** adduser holger ** adduser jenkins (with bash as default shell) ** mkdir -p /srv/reproducible-results ** chown -R jenkins:jenkins /srv/ * system maintenance ** upgraded the VM to FreeBSD 10.2 *** done with: freebsd-update upgrade -r 10.2 ==== reproducible Fedora * make sure the pages meet https://fedoraproject.org/wiki/Design/Requirements and ask the web design team for help via filing a ticket as described there * '/var/cache/mock/fedora-23-x86_64/' has three subdirs we need to handle (put on tmpfs, monitor size, clean sometimes): ccache, root_cache and yum_cache * '/var/lib/mock' should be put on /srv/workspace aka tmpfs * dont hardcode 23 in reproducible_setup_mock.sh and …build_rpm.sh * setup script: ** mock --clean just uninstalls the chroot but it'll still be rebuilt next time using cache. you can delete the caches from /var/cache/mock/ or touch the mock config ** is /etc/yum/repos.d/fedora.repo really needed? ** hosts/pb-build3/etc/yum/repos.d/* is really not sooo good but works… * build script ** cleanup mock cache between two builds: --scrub=all might be too much, but whats sensible (or is it --scrub=all?)? ** no variations introduced yet: *** use '-j$NUM_CPU' and 'NEW_NUM_CPU=$(echo $NUM_CPU-1|bc)' *** modify TZ, LANG, LC_ALL, umask * other bits: ** use modified rpmbuild package from dhiru ** verify gpg signatures (via /etc/mock/) ** one day we will want to schedule all 17k source packages in fedora… * build rawhide too (once fedora-23 builds nicely), releasever=rawhide * more notes: ** https://fedoraproject.org/wiki/Using_Mock_to_test_package_builds ** http://miroslav.suchy.cz/blog/archives/2015/05/28/increase_mock_performance_-_build_packages_in_memory/index.html ** manually create a fedora chroot using rpm, wget + yum: http://geek.co.il/2010/03/14/how-to-build-a-chroot-jail-environment-for-centos ==== reproducible Arch Linux * describe archlinux setup…! * move /var/lib/schroot to /srv/workspace aka tmpfs * setup_archlinux_schroot job: ** needs to be made idempotent ** needs to download bootstrap.tar.gz sig and verify ** once this has been done, run it more often than once a year * maintenance job: ** check for archlinux schroot sessions which should not be there and delete them. complain if that fails. * arch build.sh: ** introduce more variations: USER ** confirm the others are really working ** on SIGTERM, also ssh to remote host and cleanup there! (via ssh &) * put results in a db ** graph results * idea: when a package has been updated reschedule reverse build depends too ** (for that we need to detect updated packages first) ---- notes on source and binary versions: tar-1.28.tar.xz (source) -> tar-1.28-1-x86_64.pkg.tar.xz (binary) $PKG/PKGBUILD has: pkgname=tar pkgver=1.28 # sometimes this is calculated and not greppable, so PKGBUILD has to be sourced (in a safe environment…) pkgrel=1 ---- ==== reproducible fdroid * https://f-droid.org/wiki/page/Build_Server_Setup ==== reproducible... * openembedded.org! * Gentoo? === qa.debian.org* * udd-versionskew: explain jobs in README * udd-versionskew: also provide arch-relative version numbers in output too === d-i_manual* * d-i_check_jobs.sh: check for removed manuals (but with existing jobs) missing * svn:trunk/manual/po triggers the full build, should trigger language specific builds. * svn:trunk/manual is all thats needed, not whole svn:trunk === d-i_build* * d-i_check_jobs.sh: check for removed package (but with existing jobs) missing * build packages using jenkins-debian-glue and not with the custom scripts used today? * run scripts/digress/ ? * bubulle wrote: "Another interesting target would be d-i builds *including non uploaded packages* (something like "d-i from git repositories" images). That would in some way require to create a quite specific image, with all udebs (while netboot only has udebs needed before one gets a working network setup). === chroot-installation_* * use schroot for chroot-installation, stop using plain chroot everywhere * add alternative tests with aptitude and possible apt * split etc/schroot/default * inform debian-devel@l.d.o or -qa@? * warn about transitional packages installed (on non-upgrades only) * install all the tasks "instead", thats rather easy nowadays as all task packages are called "task*". ** make sure this includes blends === g-i-installation_* Development of these tests has stopped. In future the 'lvc*' tests (see below) should replace them. These small changes are probably still worth doing anyway: * g-i: replace '--' with '---' as param delimiter. see #776763 / 5df5b95908 in d-e-c * download .isos once in central place ** /var/lib/jenkins/jobs/g-i-installation_*/workspace/*iso needs 53GB currently, it could be 30 less * g-i_presentation: use preseeding files on jenkins.d.n and not hands.com * turn job-cfg/g-i.yaml into .yaml.py The following ideas should really only be implemented for the new 'lvc*' tests.... (but are kept here for now) * pick LANG from predefined list at random - if last build was not successful or unstable fall back to English ** these jobs would not need to do an install, just booting them in rescue mode is probably enough * for edu mainservers running as servers for workstations etc: "d-i partman-auto/choose_recipe select atomic" to be able to use smaller disk images ** same usecase: -monitor none -nographic -serial stdio === torbrowser-launcher_* * fix "schroot session cleanup loop" in _common.sh to ignore other schroots * test tbl in German * test tbl on i386 * test alpha releases ** '~/.config/torbrowser/settings' file and edit the latest_version setting ** get version from '~/.cache/torbrowser/download/RecommendedTBBVersions' ** (warning: on update checks these files are written again…) * fix broken screenshot while job is running via apache redirect * once tbl is removed from experimental, make sure the job does nothing but detect that and exits quickly+successfully. * notifications should go somewhere public, after a while of testing. * debug why iceweasel is needed to be installed… and ca-certificates too. * run this in qemu and enable apparmor too? -> create new tests for apparmor first :) ** extend setup_schroot.sh to also setup virtual harddrives, see http://diogogomes.com/2012/07/13/debootstrap-kvm-image/ ** install linux, grub and copy the testscript and ssh keys on the the fs ** configure apparmor ** boot qemu ** ssh into the vm and run the script as usal * probably not: test with python-pygame installed. * probably not: test updates - add _and_upgrade to job names) - or maybe not, as tbl doesnt do this anymore… ** touch -d "$(date -u -d '25 hours ago' '+%Y-%m-%d %H:%M')" $FILE ** repeat test… == Further ideas... === lvc, work in progress, just started * upgrade lvc configuration to test stretch * put this on debian isos too: config/chroot_local-includes/lib/live/config/9999-autotest * re-read the docs! ** http://live.debian.net/manual/stable/html/live-manual.en.html#321 * generate feature files from templates? to cope with sub-products? -> no. detect desktop type and set variables accordingly -> simpler: pass an environment variable with the type * get iso * tables for looping through features: see tails/iuk.git/features/download_target_file/Download_Target_File.feature * to debug cucumber: --verbose --backtrace --expand * drop / remove * can probably go: dhcp.rb firewall_leaks.rb dhcp.feature firewall_leaks.feature * more occurances of "the computer boots Tails" * @source (only keep product tests) * disabled stuff in common_steps.rb ** #if @vm.execute("service tor status").success? * "I set sudo password" not needed for debianlive nor debian(edu): ** #@screen.wait("TailsGreeterAdminPassword.png", 20) * $misc_files_dir needed? * def sort_isos_by_creation_date Dir.glob("#{Dir.pwd}/*.iso").sort_by {|f| tails_iso_creation_date(f)} -> useless for us, purpose is to automatically select the latest iso if none is given * search case-in-sensitive for tails+tor+amnesia * put in update_jdn.sh: ---- addgroup tcpdump dpkg-statoverride --update --add root tcpdump 754 /usr/sbin/tcpdump setcap CAP_NET_RAW+eip /usr/sbin/tcpdump adduser $USER tcpdump adduser $USER libvirt adduser $USER libvirt-qemu ---- === rebuild sid completly on demand * nthykier wants to be able to rebuild all of sid to test how changes to eg lintian, debhelper, cdbs, gcc affect the archive: * h01ger> | nthykier: so a.) rebuild everything from sid plus custom repo. b.) option to only rebuild a subset, like all rdepends or all packages build-depending on something * h01ger> | and c.) only build once, not continously and d.) enable more cores+ram on demand to build faster === Test them all * build packages from all team repos on alioth with jenkins-debian-glue on team request (eg, via a .txt file in a git.repo) for specific branches (which shall also be automated, eg. to be able to only have squeeze+sid branches build, but not all other branches.) == Debian Packaging related This setup should come as a Debian source package... * /usr/sbin/jenkins.debian.net-setup needs to be written * what update-j.d.n.sh does, needs to be put elsewhere... * debian/copyright is incorrect about some licenses: ** the profitbricks+debian+jenkins logos ** the preseeding files ** ./feature/ is gpl3 // vim: set filetype=asciidoc: