ToDo for jenkins.debian.net
===========================
:Author: Holger Levsen
:Authorinitials: holger
:EMail: holger@layer-acht.org
:Status: working, in progress
:lang: en
:Doctype: article
:Licence: GPLv2
== About jenkins.debian.net
See link:https://jenkins.debian.net/userContent/about.html["about jenkins.debian.net"] for a general description of the setup. Below is the current TODO list, which is long and probably incomplete too. The links:https://jenkins.debian.net/userContent/contributing.html[the preferred form of contributions] are patches via pull requests.
== Fix user submitted bugs
* There are link:https://bugs.debian.org/cgi-bin/pkgreport.cgi?tag=jenkins;users=qa.debian.org%40packages.debian.org["bugs filed against the pseudopackage 'qa.debian.org' with usertag 'jenkins'"] in the BTS which would be nice to be fixed soon, as some people actually care.
== General ToDo
* replace amd64 in scripts with $HOSTARCH
* extend /etc/rc.local to do cleanup of lockfiles
* explain in README how to write jobs, eg which pathes are on tmpfs
* fix apache ssl configuration as hinted by eg https://sslcheck.globalsign.com/en/sslcheck?host=jenkins.debian.net#78.137.96.196
=== proper backup
* gpg encrypted to some keys
* run on alioth or paradis
* '/var/lib/jenkins/jobs' (the results - the configs are in .git)
* '/var/lib/munin'
* '/var/log'
* '/root/' (contains etckeeper.git)
* '/var/lib/jenkins/reproducible.db' (is backed up manually)
* '/srv/jenkins.debian.net-scm-sync.git' (is backed up manually)
* '/var/lib/jenkins/plugins/*.jpi' (can be derived from jdn-scm-sync.git)
* '/srv/jenkins.debian.net-scm-sync.git'
* '/etc/.git' and '/etc'
* postpone til we run on .debian.org?
=== TODO for testing stretch
Most jobs have been converted, a few are left to do:
* add g-i tests for stretch
* add stretch live-builds
* do lvc for stretch too
* mention stretch in README where appropriate
=== move this setup to jenkins.d.o
The plan is to run a jenkins.d.o host, which is maintained by DSA, but we are maintaining jenkins on it (so we can install any plugins we like etc). then we also setup several jenkins slaves, probably/maybe also maintained by DSA (so we get them into their munin), but on which we can use sudo as we need it. (or maybe not dsa-maintained slaves, so that we can use sudo as we need, for the price of not being in DSAs munin.)
- install munin on jenkins, monitor nodes as slaves
==== next steps for jenkins.d.o migration
* add more pressure to DSA to get jenkins.d.o set up
** "having this mid september is doable, just grab some from DSA on irc"
** the machine jerea.debian.org is already setup, it just needs the the alias added.
** weasel/h01ger: install jenkins.deb from jenkins-ci.org
*** also create jenkins users in jenkins (KISS)
** install slaves - how (to automate)?
** install jenkins-job-builder "somehow", currently "only" a package from fil exists (but needs build-depends not yet even in sid atm)
** update DNS to point jenkins.d.o to jerea.d.o
*** the existing jenkins.d.o host needs to be renamed to something else (thats "just work" to do but not a major obstacle)
** j-j-b (and it's build-depends, python-jenkins) are now maintained as part of openstack. we need these packages in jessie-backports, also the j-j-b in sid/NEW currently still lacks fil/h01ger's patches from git master.
==== unsorted notes for jenkins.d.o migration
* sudoers.d/jenkins:
** not suitable for jenkins.d.o, thus we will run all tests on slaves, where DSA doesnt care what we do
* livescreenshot plugin (we use a patched version) is ok:
** jenkins maintenance probably is best done by jenkins users (as opposed to DSA) so it's up to us what plugins we install
* chroot jobs should use real schroot sessions, and not just use schroot as poor chroot(8) replacement. some links:
** https://anonscm.debian.org/cgit/mirror/dsa-puppet.git/tree/modules/schroot
** https://anonscm.debian.org/cgit/mirror/dsa-puppet.git/tree/modules/porterbox/files/dd-schroot-cmd
** https://gitweb.torproject.org/project/jenkins/tools.git/tree/slaves/linux/build-wrapper
=== To be done once jenkins.d.n runs jessie
* replace with bin/setsid.py workaround with setsid from the util-linux package from jessie
* bin/g-i-installation: use lvcreate without --virtualsize
* check if the sudo workaround in bin/g-i-installation is still needed: 'guestmount -o uid=$(id -u) -o gid=$(id -g)' would be nicer, but it doesnt work: as root, the files seem to belong to jenkins, but as jenkins they cannot be accessed.
* install botch from jessie-backports once its available (and remove botch from the reproducible-unstable schroot)
=== To be done once bugs are fixed
* link:https://bugs.debian.org/767260[#767260] workaround in bin/d-i_build.sh (console-setup doesn't support parallel build)
* link:https://bugs.debian.org/767032[#767032] manual fix in etc/munin/plugins/munin_stats
* link:https://bugs.debian.org/767100[#767100] work in progress in etc/munin/plugins/cpu
* link:https://bugs.debian.org/767018[#767018] work in progress in etc/munin/plugins/iostat_ios
* link:https://bugs.debian.org/774685[#774685] workaround in bin/reproducible_create_meta_pkg_sets.sh
=== jenkins-job-builder related
* use fil's package: http://ftp.hands.com/hands-deb/pool/main/j/jenkins-job-builder/jenkins-job-builder_1.2.0~git-0~1.gbp1ae299.dsc
** needs at least some of the stuff on fil's j.d.n.git/needs-jjb-1.2.0 branch to work with that
* investigate whether its possible nowadays to let it delete jobs which were removed.. nope. though it is possible to compare the list of all jobs which should be generated with those which are there and delete those which should not.
==== old jenkins-job-builder TODO, mostly (all?) obsolete thanks to fil's work:
* use jessie version plus h01ger's patches from kali
* change of syntax:
----
properties:
- priority-sorter:
priority: 150
----
* this seems to be helpful: http://en.wikipedia.org/wiki/YAML#References (pyyaml which jenkins-job-builder uses supports them)
* cleanup h01ger's patches (eg add documentation) and send pull requests on github:
** publisher:logparse
** publisher:htmlpublisher
** svn:scm upstreamed at https://review.openstack.org/#/c/192095/
** wrappers:live-screenshot upstreamed at https://review.openstack.org/#/c/191708/
** image-gallery: https://review.openstack.org/#/c/175747/ superseeds h01gers patch: https://review.openstack.org/#/c/191950/
** sidebar: upstreamed at https://review.openstack.org/#/c/191585/
=== livescreenshot plugin
* publish forked livescreenshot plugin and send pull request for h01ger's bugfix
** see ssh://git.debian.org/git/users/holger/livescreenshot-plugin.git and 0b407b70025 there
== lvc, work in progress, just started
* put this on debian isos too: config/chroot_local-includes/lib/live/config/9999-autotest
* add another (smaller) test: download+run torbrowser daily
* re-read the docs!
** http://live.debian.net/manual/stable/html/live-manual.en.html#321
* generate feature files from templates? to cope with sub-products?
-> no. detect desktop type and set variables accordingly
-> simpler: pass an environment variable with the type
* get iso
* tables for looping through features: see tails/iuk.git/features/download_target_file/Download_Target_File.feature
* to debug cucumber: --verbose --backtrace --expand
* drop / remove
* can probably go: dhcp.rb firewall_leaks.rb dhcp.feature firewall_leaks.feature
* more occurances of "the computer boots Tails"
* @source (only keep product tests)
* disabled stuff in common_steps.rb
** #if @vm.execute("service tor status").success?
* "I set sudo password" not needed for debianlive nor debian(edu):
** #@screen.wait("TailsGreeterAdminPassword.png", 20)
* $misc_files_dir needed?
* def sort_isos_by_creation_date
Dir.glob("#{Dir.pwd}/*.iso").sort_by {|f| tails_iso_creation_date(f)}
-> useless for us, purpose is to automatically select the latest iso if none is given
* search case-in-sensitive for tails+tor+amnesia
* put in update_jdn.sh:
----
addgroup tcpdump
dpkg-statoverride --update --add root tcpdump 754 /usr/sbin/tcpdump
setcap CAP_NET_RAW+eip /usr/sbin/tcpdump
adduser $USER tcpdump
adduser $USER libvirt
adduser $USER libvirt-qemu
----
== Improve existing tests
=== reproducible builds
* higher prio:
** more graphs:
*** graph average build duration by day
*** graph packages in testing+unstable which need to be fixed
** notes related:
*** #786396: classify issue by "toolchain" or "package" fix needed: show bugs which block a bug
*** new page with annoted packages without categorized issues
*** new page with packages that have notes with comments (which are often useful / contain solutions / low-hanging fruits for newcomers)
*** new page with notes that doesnt make sense: a.) packages which are reproducible but should not, packages that build but shouldn't, etc.
*** new page with packages which are reproducible on one arch and unreproducible on another arch (in the same suite, so unstable only atm)
*** new page with packages which ftbfs on one arch and build fine on another arch (in the same suite, so unstable only atm)
*** send notifications to maintainers when a note to their packages changes?
** new page: packages which are orphaned but have a reproducible usertagged patch
** reproducible_build.sh changes needed:
*** fix the cause for the frequent 404s or workaround it well
*** fix: "DIFFOSCOPE='E: Failed to change to directory ‘/tmp’: Permission denied" - maybe by making sure the cause is gone…
*** race condition with itself? https://jenkins.debian.net/job/reproducible_builder_armhf_6/716/console says Warning, package php5 in unstable on armhf is probably already building elsewhere, exiting. Please check https://jenkins.debian.net/job/reproducible_builder_armhf_6/716/ which does not compute.
*** deal properly with out of diskspace problems, https://jenkins.debian.net/job/reproducible_builder_amd64_8/2575/ is an archived build which hit it
*** diffoscope needs to be run on the target arch...
**** should probably be doable with qemu-static on the host
*** once all non remote build jobs are removed, remove the legacy code from _build.sh
** setup builder3, to distribute more ressources away from jenkins itself
** test coreboot/openwrt/netbsd/freebsd on a node
** explain status in plain english on each coreboot/openwrt/netbsd/freebsd page, also on the Debian dashboard plus add an "executive summary about reproducible builds in the free software world"
*** get the content for "
status of $1
" from notes.git/friends.yaml or such
** mattia: .py scripts: UDD or any db connection errors should either be retried or cause an abort (not failure!) of the job
** repo-comparison: check for binaries without source
** link source-data-epoch spec from rb.d.n
** link howto on each coreboot/openwrt/netbsd/freebsd page
** pkg sets are still amd64 only atm…
* lesser prio
** check that cleanup of old diffscope schroots on armhf+amd64 nodes works....
** check that /srv/workspace/pbuilder/ is cleaned up properly
** right align all numbers in table in the dashboard
** use build3 (or 5) as (sudo) test system to develop a dsa approvable sudoers.d
** do final s#debbindiff#diffoscope#g and s#dbd#ds#g and rename .debbindiff.html+txt files as well as the dbd directories...
** run pb2-amd64 builder 400 days in the past… (that way we will notice problems soon)
*** Acquire::Check-Valid-Until "false" should help, when setting the time to the future
** document (in README) the multihost setup
** pkg sets related:
*** fix essential set: currently it only has the ones explicitly marked Essential:yes; they and their dependencies make up the full "essential closure set" (sometimes also called pseudo-essential)
*** replace bin/reproducible_installed_on_debian.org with a proper data provider from DSA, eg https://anonscm.debian.org/cgit/mirror/debian.org.git/plain/debian/control
*** reproducible_create_meta_pkg_sets uses schroot created by dpkg_setup_schroot_jessie job (outside of reproducible job space...)
** "fork" etc/schroot/default into etc/schroot/reproducible
** move "untested" field in stats table too? (as in csv output...)
** a reproducible_log_grep_by_sql.(py|sh) would be nice, to only grep in packages with a certain status (build in the last X days)
** adopt usertag script from pkg-apparmor to notify us about new usertagged bugs automatically
** use reprepro and snapshot (reprepro gen-snapshot) on alioth to speed up our repo, maybe. maybe we'll just be in sid soon :-)
** .db: stats_build table should have package ids, not just src+suite+arch as primary key
* missing tests:
** variation in kernel
** different cpu type (coming RSN)
** variation in date (coming RSN)
** prebuilder does (user) group variation like this: https://anonscm.debian.org/cgit/reproducible/misc.git/tree/prebuilder/pbuilderhooks/A02_user
** variation of $TERM and $COLUMN (and maybe $LINES), unset in the first run, set to "linux" and "77" (and maybe "42") in the 2nd run. maybe vary $SHELL too.
*** actually TERM is set to "linux" by default already, COLUMN is unset
** variation of filesystem being built on (work in progres, see 'higher prio' list above)
** variation of users shell (bash + zsh?)
* support for arbitrary (soon to be implemented) Debian-PPAs and external repos, by just giving a source URL
* enable people to upload test packages, to be built in jenkins:
----
h01ger: another wild future request by me: allowing us to upload something and let jenkins test it. rationale: I sent (another) patch for debian-keyring, to fix a timestamp issue in debian control files (due to not_using_dh-builddeb), but there is also a umask issue. I don't want to bother me to setup the very same things jenkins tests locally (I already did too much in this regards, imho), but really people can't tests everything jenkins tests.
maybe it should just be a jenkins job not integrated into the rp.d.n webui, but... maybe we find a nice way to do it
h01ger: I'm instead thinking about a repo defining a reproducible-specific suite or something on that line, that integrates well with the current setup. but this is really something wild.
----
==== todo for reproducible remote building
* open questions:
** save build-host in build_duration table too? (and change to saving the time of a single build, not both combined)
** maintenance is general, cleanup of started but interrupted builds...
* grep for "locales-all" in build logs… (to confirm that its really used on the new build nodes)
==== reproducible Debian armhf
* make systems send mail, use port 465
* bpi0 and hb0 should run in the future…
==== status of new remote build nodes for amd64
* profitbricks-build2-amd64
** should run in the future…
** needs to be moved to a different Profitbricks datacenter, to run with a different cpu type
* profitbricks-build2-amd64
** should have one more cpu then build2, so we dont need to run with NUM_CPU-1
==== reproducible Debian installation
* see https://wiki.debian.org/ReproducibleInstalls
* add the test (something weekly or so)
==== reproducible coreboot
* add more variations: domain+hostname, uid+gid, USER, UTS namespace
* build the docs?
* also build with payloads. x86 use seabios as default, arm boards dont have a default. grub is another payload. and these: bayou coreinfo external filo libpayload nvramcui - and:
** CONFIG_PAYLOAD_NONE=y
** CONFIG_PAYLOAD_ELF is not set
** CONFIG_PAYLOAD_LINUX is not set
** CONFIG_PAYLOAD_SEABIOS is not set
** CONFIG_PAYLOAD_FILO is not set
** CONFIG_PAYLOAD_GRUB2 is not set
** CONFIG_PAYLOAD_TIANOCORE is not set
* libreboot ships images, verify those?
* explain status in plain english
* use disorderfs for 2nd build?
==== reproducible OpenWrt
* add credit for logo/artwork
* build more archs (http://downloads.openwrt.org/chaos_calmer/15.05-rc1/ lists many to choose from)
* build all packages? (set CONFIG_ALL=y and run 'make defconfig')
** just build some first...
* file dbd bug about unable to inspect these .bin files
* file dbd bug about crashing on certain squashfs files
* explain status in plain english
* use disorderfs for 2nd build?
==== reproducible NetBSD
* announce on their list
* explain status in plain english
** MKREPRO is set to "yes"
* use disorderfs for 2nd build?
==== reproducible Fedora
* use mock to create a fedora chroot to build in
** http://blog.packagecloud.io/eng/2015/05/11/building-rpm-packages-with-mock/
** http://blog.packagecloud.io/eng/2015/04/20/working-with-source-rpms/
* start with building a single package (which is reproducible on Debian), only build that one, until its reproducible
** then eventually build the full base system (100-500 packages), once that package is reprodcuible (aka the rpm toolchain has been fixed...)
* maybe call the script reproducible_rpms.sh and also let it build OpenSuSE packages?
* document in the initial webpage, that we don't have a clear idea yet, how to record+reproduce the build environment. +that this is essential for reproducible builds too.
==== reproducible FreeBSD
* useful improvements:
** investigate how to use tmpfs on freebsd and build there
** upgrade the VM to FreeBSD 10.2
** find a way to be informed about updates and keep it updated
** build master branch instead of release/10.2.0
** modify PATH, uid, gid and USER too, maybe host+domainname as well. (The VM is only used for this, so we could change the host+domainname before building... probably even the date)
** add freebsd vm as node to jenkins and run the script directly there, saves lot of ssh hassle
* random notes, to be moved to README
** we build freebsd 10.1 (=released) atm
** we build with sudo too
*** rather not change /usr/obj to be '~jenkins/obj' and build with WITH_INSTALL_AS_USER. also not build in /usr/src. if so, we need to define some variable so we can do so.... but we need a stable path anyway, so whats the point.
*** maybe build as user in /usr/src...
* first build world, later build ports (pkg info...)
* document how the freebsd build VM was set up:
** base 10.1 install following https://www.urbas.eu/freebsd-10-and-profitbricks/
** modified files:
*** /etc/rc.conf
*** /etc/resolv.conf
*** /boot/loader.conf.local
** pkg install screen git vim sudo denyhosts munin-node
** adduser holger
** adduser jenkins (with bash as default shell)
** mkdir -p /srv/reproducible-results
** chown -R jenkins:jenkins /srv/
==== reproducible...
* openembedded.org!
* Arch? Gentoo?
=== qa.debian.org*
* udd-versionskew: explain jobs in README
* udd-versionskew: also provide arch-relative version numbers in output too
=== d-i_manual*
* d-i_check_jobs.sh: check for removed manuals (but with existing jobs) missing
* svn:trunk/manual/po triggers the full build, should trigger language specific builds.
* svn:trunk/manual is all thats needed, not whole svn:trunk
=== d-i_build*
* d-i_check_jobs.sh: check for removed package (but with existing jobs) missing
* build packages using jenkins-debian-glue and not with the custom scripts used today?
* run scripts/digress/ ?
* bubulle wrote: "Another interesting target would be d-i builds *including non uploaded packages* (something like "d-i from git repositories" images). That would in some way require to create a quite specific image, with all udebs (while netboot only has udebs needed before one gets a working network setup).
=== chroot-installation_*
* use schroot for chroot-installation, stop using plain chroot everywhere
* add alternative tests with aptitude and possible apt
* split etc/schroot/default
* inform debian-devel@l.d.o or -qa@?
* warn about transitional packages installed (on non-upgrades only)
* install all the tasks "instead", thats rather easy nowadays as all task packages are called "task*".
** make sure this includes blends
=== g-i-installation_*
Development of these tests has stopped. In future the 'lvc*' tests should replace them.
These small changes are probably still worth doing anyway:
* g-i: replace '--' with '---' as param delimiter. see #776763 / 5df5b95908 in d-e-c
* download .isos once in central place
** /var/lib/jenkins/jobs/g-i-installation_*/workspace/*iso needs 53GB currently, it could be 30 less
* g-i_presentation: use preseeding files on jenkins.d.n and not hands.com
* turn job-cfg/g-i.yaml into .yaml.py
The following ideas should really only be implemented for the new 'lvc*' tests.... (but are kept here for now)
* pick LANG from predefined list at random - if last build was not successful or unstable fall back to English
** these jobs would not need to do an install, just booting them in rescue mode is probably enough
* for edu mainservers running as servers for workstations etc: "d-i partman-auto/choose_recipe select atomic" to be able to use smaller disk images
** same usecase: -monitor none -nographic -serial stdio
== Further ideas...
=== rebuild sid completly on demand
* nthykier wants to be able to rebuild all of sid to test how changes to eg lintian, debhelper, cdbs, gcc affect the archive:
* h01ger> | nthykier: so a.) rebuild everything from sid plus custom repo. b.) option to only rebuild a subset, like all rdepends or all packages build-depending on something
* h01ger> | and c.) only build once, not continously and d.) enable more cores+ram on demand to build faster
=== Test them all
* build packages from all team repos on alioth with jenkins-debian-glue on team request (eg, via a .txt file in a git.repo) for specific branches (which shall also be automated, eg. to be able to only have squeeze+sid branches build, but not all other branches.)
== Debian Packaging related
This setup should come as a Debian source package...
* /usr/sbin/jenkins.debian.net-setup needs to be written
* what update-j.d.n.sh does, needs to be put elsewhere...
* debian/copyright is incorrect about some licenses:
** the profitbricks+debian+jenkins logos
** the preseeding files
** ./feature/ is gpl3
// vim: set filetype=asciidoc: