FreeBSD Project "ideas" List
Over the years, the FreeBSD Project has built up a list of ideas for implementation work that seems like it might be a good idea, but no hands are available to do it. If you would like to contribute to the FreeBSD Project, you might peruse this list to get a sense of the kinds of work available to do. Obviously, contributions are not limited to this list!
Please do contact us before starting on an item, though: sometimes items remain on the list after they are completed, and sometimes they are just ideas rather than a recipe for success. Searching our mailing list archives may turn up the discussions that led to an idea being put on the list. Frequently the initial goal is simply to investigate the idea, rather than to produce code. Many project ideas list contacts, who are worth e-mailing first. Otherwise (and perhaps as well), send e-mail to our hackers@ mailing list.
For Google Summer of Code students:
For the list of Google Summer of Code ideas, visit: SummerOfCodeIdeas
Please note: ideas on this list are generally considered unsuitable for Summer of Code projects for many possible reasons (too large, too small, no available mentor, etc) but there may still be ideas here that could form the basis of a good GSoC task.
Contents
- FreeBSD Project "ideas" List
- Wireless Projects
- Embedded Projects
- File System Projects
- Kernel Projects
- Virtualization Projects
- Networking Projects
- Porting Projects
- Testing and Continuous Integration projects
- Userland / Installation Tools Projects
  - Switch procstat from subcommand flags to verbs
  - BSD-licensed ELF Tools
  - BSD-licensed Text-Processing Tools
  - NDMP data server
  - Port prebind from OpenBSD
  - Proxy auto-config file support for libfetch
  - PXE Installer
  - Improve cron(8) and atrun(8)
  - libpw
  - resurrect memory leak detector libmprof
  - Safe crash dumps
  - Import syslogd improvements from NetBSD
  - Add support for usbdump file-format to wireshark and vusb-analyzer
  - Support for setting base system build options via dialog(1)
  - RAID and disk monitoring suite
  - Cross-building FreeBSD from Linux and/or Mac OSX
- Global Projects (may touch everything)
- Other Projects
Wireless Projects
These are now on WifiIdeasPage.
Embedded Projects
Reduced FreeBSD kernel size for embedded
Technical Contact: imp@
Description
The FreeBSD kernel has been optimized over the years for a server or workstation environment. Memory is plentiful in these environments, so little attention was given to the size of the kernel. There are a number of items in the kernel that can be made optional without affecting the functionality needed in an embedded environment. These include not compiling strings into the kernel, inlining code less aggressively, making some non-optional features optional, and investigating compile-time flags. This task requires identifying potentially optional kernel content and building the infrastructure to make that content optional.
Requirements
- Strong C language programming skills.
- Understanding of system calls and general kernel architecture.
Make creating a bus easier
Technical Contact: imp@
Description
There are about a dozen buses in the tree now that manage resources and activate children. They are far too hard to create. We need to abstract out the basics of these buses and provide a way for them to be subclasses of a new base class.
Requirements
- Good C programming skills
- Good knowledge of the NewBus configuration system in FreeBSD
Variable hints
Technical Contact: imp@
Description
Often in the embedded world, you know which built-in devices are on a SoC (System on a Chip) only because you know the specific model of that SoC. It is desirable to have a mechanism that code on these machines can use to load one of several sets of hints, which can then be used to populate the bus.
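One possible shape for such a mechanism is a table keyed by SoC model, sketched below in portable C. The model names, the `hint_set` structure, the helper functions and the hint values are all assumptions made for illustration; only the `hint.<driver>.<unit>.<key>` style is borrowed from existing device hints.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A named set of hints; every name here is hypothetical. */
struct hint_set {
	const char *soc_model;		/* model string known to board code */
	const char *const *hints;	/* NULL-terminated list of hints */
};

/* Illustrative values only; they do not describe real hardware. */
static const char *const soc_a_hints[] = {
	"hint.uart.0.at=obio0",
	"hint.uart.0.maddr=0x10000000",
	NULL
};

static const char *const soc_b_hints[] = {
	"hint.uart.0.at=obio0",
	"hint.uart.0.maddr=0x20000000",
	"hint.gpio.0.at=obio0",
	NULL
};

static const struct hint_set hint_sets[] = {
	{ "example-soc-a", soc_a_hints },
	{ "example-soc-b", soc_b_hints },
};

/* Return the hint list for a SoC model, or NULL if the model is unknown. */
static const char *const *
hints_for_soc(const char *model)
{
	size_t i;

	assert(model != NULL);
	for (i = 0; i < sizeof(hint_sets) / sizeof(hint_sets[0]); i++) {
		if (strcmp(hint_sets[i].soc_model, model) == 0)
			return (hint_sets[i].hints);
	}
	return (NULL);
}

/* Count the entries in a NULL-terminated hint list (0 for NULL). */
static size_t
hint_count(const char *const *hints)
{
	size_t n = 0;

	while (hints != NULL && hints[n] != NULL)
		n++;
	return (n);
}
```

Early platform code would call something like `hints_for_soc()` once the model is known and feed the resulting list to whatever populates the bus.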
Requirements
- Good C programming skills
ARM cleanup
Technical Contact: imp@
Description
Adding a new board to the arm code is a lot harder than it needs to be. A lot of benefit could be had by creating tables for memory ranges, etc, and having more generic initialization code. Much of this can also be Machine Independent (MI).
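As a rough illustration of the table-driven approach, the sketch below describes a board as data and lets generic code walk it. The structures, the function names and the example address ranges are hypothetical and are not taken from the existing arm code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One physical address range on a board (hypothetical layout). */
struct mem_range {
	uint32_t pa_start;	/* physical start address */
	uint32_t size;		/* length in bytes */
	int	 is_device;	/* nonzero for device registers, 0 for RAM */
};

/* Everything the generic code needs to know about one board. */
struct board_desc {
	const char *name;
	const struct mem_range *ranges;
	size_t nranges;
};

/* Made-up example board: 64 MB of RAM and 1 MB of device registers. */
static const struct mem_range example_ranges[] = {
	{ 0x00000000u, 0x04000000u, 0 },
	{ 0x10000000u, 0x00100000u, 1 },
};

static const struct board_desc example_board = {
	"example-board", example_ranges,
	sizeof(example_ranges) / sizeof(example_ranges[0])
};

/* Generic, board-independent code: total RAM described by the table. */
static uint64_t
board_ram_bytes(const struct board_desc *bd)
{
	uint64_t total = 0;
	size_t i;

	assert(bd != NULL);
	for (i = 0; i < bd->nranges; i++) {
		if (!bd->ranges[i].is_device)
			total += bd->ranges[i].size;
	}
	return (total);
}

/* Generic lookup: the range containing pa, or NULL if none does. */
static const struct mem_range *
board_find_range(const struct board_desc *bd, uint32_t pa)
{
	size_t i;

	assert(bd != NULL);
	for (i = 0; i < bd->nranges; i++) {
		const struct mem_range *r = &bd->ranges[i];

		if (pa >= r->pa_start && pa - r->pa_start < r->size)
			return (r);
	}
	return (NULL);
}
```

With this layout, adding a board means adding one table rather than another copy of the initialization code.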
Requirements
- Good C programming skills
- Good refactoring skills
File System Projects
Improve the performance of dump/restore
Description
A performance evaluation of the split cache (as is) and a unified cache (as in, e.g., NetBSD) would be interesting. More details in this mail to the hackers mailing list. Additional improvements are welcome too.
Requirements
- Knowledge of C programming.
- Basic understanding of backup/restore procedures.
Filesystem decompression layer
Solaris 10 and newer provide dcfs, a read-only decompression stacking filesystem layer for UFS. Files are initially compressed by a userland fiocompress utility. The filesystem layer is very simple: it is implemented in a single file and permits transparent decompression, so the end user does not need to know whether a file is compressed. Although the implementation is simple, it is very useful: it makes it possible to install on systems with little memory, and allows quick compression of files in typically read-only directories like /usr/bin. While the Solaris fiocompress implementation uses zlib, it would be easy to use more modern algorithms like lz4 or snappy.
References: File-System Development with Stackable Layers
Bring back the RAIDframe port
https://people.freebsd.org/~scottl/rf/
Description
NetBSD has a software RAID framework that is very useful for prototyping or otherwise implementing RAID solutions. Around FreeBSD 5 there was a heroic effort to port it; however, the interfaces to devices and disks were changed and the port never caught up, so the code was removed from FreeBSD in Revision 127066.
Requirements
- Ability to read and understand foreign C code, familiarity with filesystems is a plus.
- Knowledge of geom(4).
- Willingness to experiment and debug filesystems.
OpenBFS
Practical Filesystem Design Book
Description
Before working at Apple, Dominic Giampaolo wrote a 64 bit journalling filesystem for BeOS. The objectives were to have a modern filesystem capable of streaming for multimedia applications. It also supports indexing and Extended Attributes for desktop use. Haiku has implemented a version of the filesystem based on the documentation which is used as their base filesystem and is available under an MIT license. Porting it, or rather using the existing implementation as the basis for a new one, would make an interesting case for an alternative filesystem.
Requirements
- Ability to read and understand foreign C code, familiarity with FreeBSD's VFS is a plus.
- Ability to interpret results from testsuites and find solutions.
- Knowledge of filesystem internals.
Porting HFS+
Description
The Hierarchical File System was developed by Apple Inc. for use in MacOS. With the release of MacOS X it received many new features, and the source code was made available as part of XNU. An initial HFS port to FreeBSD 5.3 was made. Although it was subsequently abandoned, and support for locking still has to be added, it would be excellent reference material for an updated port. A port would also be a good reference for bringing over other interesting filesystems from Apple's Darwin.
Requirements
- Strong knowledge of C.
- Understanding of FreeBSD's filesystem interfacing and VFS.
Kernel Projects
Document all sysctls
Technical Contact: mat@, brd@, eadler@
Description
The sysctl(8) utility retrieves kernel states and allows processes with appropriate privilege to change kernel states. On request it can display description lines that document the kernel state. Unfortunately, not every sysctl is documented. This task can be shared with other volunteers. mat has done some development in Perforce, in the mat_sysctl_cleanup branch.
- Find every undocumented sysctl in the kernel.
- Try to determine what this sysctl is for and document it.
Requirements
- Ability to read and understand foreign C code.
Document the sound subsystem
Technical Contact: netchild@
Description
- Add sound subsystem related section 9 manual pages; so far no sound subsystem related manual pages exist.
- Add an example driver in share/examples that can serve as a starting point for writing a new driver. For this purpose the example driver should contain enough documentation as comments and/or pointers to documentation in man-section 9. This work can be based upon http://people.freebsd.org/~cg/template.c.
- Rewrite the sound subsystem chapter in the FreeBSD Architecture Handbook. The rewrite should contain an overview of the available parts in the sound subsystem and how they interact (data flow, dependencies, ...) and fit together. Additionally it should contain links to already available documentation (official standards, section 9 manual pages, ...).
The page soundsystem which documents everything related to the sound subsystem in FreeBSD has been created.
Requirements
- Ability to read and understand foreign C code.
- Documentation writing skills.
Kernel fuzzing suite
Description
FreeBSD's memguard(9) and the compiler stack protection offer a good framework to detect memory leaks and buffer overflows in the kernel, and the complete OS is frequently checked with static analysis tools, but we lack kernel-specific fuzz testing tools to aid in such detection. Originally the Linux Trinity fuzzer was the main example of such a tool; Dmitry Vyukov's syzkaller is somewhat more promising, and more recently there is also TriForceAFL, used for OpenBSD.
A native tool would be good, but perhaps just running the Trinity tool under the Linux emulator, along with memguard(9), would reveal general bugs in the kernel.
Reference: syzkaller for freebsd posting
DTrace
Technical Contact: rwatson@
Homepage: Perforce repository, DTrace for FreeBSD
DTrace is a dynamic tracing facility designed by Sun Microsystems and released in Solaris 10. They have since released the major part of Solaris under the banner of OpenSolaris and the Common Development and Distribution License (CDDL) 1.0.
- We need a clean CTF implementation for FreeBSD to avoid the license (CDDL) that Sun has on their code. A specification about the file format needs to be written, and someone who never looked at the Sun code (and doesn't while doing the work) would have to implement that and write tests for the implementation.
- We can always use more Providers.
Requirements
- Ability to read and understand foreign C code.
- Ability to write C code.
- A good understanding of the FreeBSD kernel.
DWARF2 call frame information (GSoC 2011)
Description
A debug kernel is no longer able to show stack traces across exceptions. This is because we do not emit any DWARF2 call frame information for any assembler code, since gdb switched to the DWARF2 format. A volunteer should annotate every assembler file (*.[sS]) with DWARF2 call frame information. This was started as a GSoC project but needs more work.
Requirements
- Knowledge of assembly code.
- Knowledge of ".cfi_*" pseudo-ops to insert dwarf2 frame descriptors.
Implement support for kernel Address Sanitizer
The kernel Address Sanitizer (KASAN) is a dynamic memory error detector developed by Google, based on the similar userland tool. It provides a fast and comprehensive solution for finding use-after-free and out-of-bounds bugs. It requires instrumentation in the kernel.
Requirements
- Knowledge of kernel memory allocation.
Suspend to disk
Implement a suspend/resume from disk mechanism. Possibly use the dump functions to dump pages to disk, then use ACPI to put the system in S4 or power-off. Resume would require changes to the loader to load the memory image directly and then begin executing again.
Requirements
- Good knowledge of C.
- Understanding of the hardware/software interface.
- A laptop that works with ACPI.
- Kernel awareness.
Sync FreeBSD i386 boot code with DragonFly
Technical Contact: jhb@
DragonFly invested a lot of time to clean up and document its i386 boot code. Additionally, they fixed some bugs. Interesting files in the DragonFly CVS are sys/boot/i386/bootasm.h, sys/boot/i386/bootasmdef.c, sys/boot/boot0/*, sys/boot/boot2/*, sys/boot/i386/btx/*, sys/boot/i386/cdboot/*, sys/boot/i386/libi386/amd64_tramp.S, sys/boot/i386/libi386/biosdisk.c and sys/boot/i386/loader/main.c. An interested volunteer has to compare and evaluate both implementations and port the interesting/good parts.
Requirements
- Ability to read and understand foreign C code.
- Ability to write C code.
- Knowledge of i386 assembly.
- Knowledge of BIOS interfaces.
- Knowledge of low-level boot behavior.
Solaris Doors IPC Implementation
Fast Sockets, An IPC Library to Boost Application Performance
An Implementation of the Solaris Doors API for Linux
Doors provide a mechanism for processes to issue remote procedure calls to functions in other processes running on the same system. The door APIs were developed by Sun Microsystems as a core part of the Spring operating system and were officially available in Solaris 2.6. They are extensively used in Illumos.
In addition to the Solaris/Illumos port, which is well documented, there is also an outdated implementation for Linux that can serve for comparison. The project would consist of understanding how the existing code works and designing a completely new, but compatible, implementation for FreeBSD.
Requirements
- Interest in Inter-process Communications
- Ability to understand code and the existing implementations in Illumos and Linux.
- Capacity to run your own tests and benchmarks.
Userspace mount() implementation
Technical Contact: brooks@
The mount() system call has a badly designed interface, is obsolete, and has largely been replaced by the nmount() system call. It should be straightforward to implement a user space wrapper that parses the passed structure and constructs an nmount() call. This should allow mount() to be removed on new architectures and space constrained systems.
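A minimal sketch of the translation step follows. It only builds the name/value `struct iovec` array that nmount() consumes and does not perform a mount, so it compiles on non-FreeBSD hosts too. The `example_args` structure is a hypothetical stand-in for a filesystem-specific mount() data structure, and `mount_to_iov()` and `add_pair()` are illustrative names; `fstype`, `fspath` and `from` are the option names commonly passed to nmount().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

#define	MAX_IOV	8	/* room for four name/value pairs */

/* Hypothetical stand-in for a filesystem-specific mount(2) argument. */
struct example_args {
	const char *fspec;	/* device or other source to mount */
};

/* Append one name/value pair; lengths include the terminating NUL. */
static void
add_pair(struct iovec *iov, size_t *niov, const char *name, const char *val)
{
	assert(*niov + 2 <= MAX_IOV);
	iov[*niov].iov_base = (void *)name;
	iov[*niov].iov_len = strlen(name) + 1;
	(*niov)++;
	iov[*niov].iov_base = (void *)val;
	iov[*niov].iov_len = strlen(val) + 1;
	(*niov)++;
}

/*
 * Translate mount()-style arguments into an nmount()-style vector.
 * Returns the number of iovec entries filled in.  On FreeBSD the
 * wrapper would finish with: return (nmount(iov, niov, flags));
 */
static size_t
mount_to_iov(const char *type, const char *dir,
    const struct example_args *args, struct iovec *iov)
{
	size_t niov = 0;

	add_pair(iov, &niov, "fstype", type);
	add_pair(iov, &niov, "fspath", dir);
	add_pair(iov, &niov, "from", args->fspec);
	return (niov);
}
```

A real wrapper would need one such translation per filesystem type, since each type passes a different structure through the `data` argument of mount().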
Requirements
- Ability to read and understand foreign C code.
- Ability to write C code.
- Ability to write tests.
Virtualization Projects
bhyve gdb-stub/dcons integration
FreeBSD's 'bhyve' Hypervisor has a feature where it allows the kernel debugger to communicate with the outside world via a socket. Unfortunately it is quite slow, being polled one byte at a time. In addition it still uses the kernel on the VM to do all the work. Some avenues of improvement could include using the existing dcons memory buffer driver so that a whole buffer might be transferred at a time, or to make the current driver much faster. One might also intercept some of the commands (memory read for example) and perform them directly in the hypervisor so that cooperation of the virtualized system is not required to examine memory. It is even possible that by manipulating the processor flags appropriately, one could single step or 'break' the guest without its cooperation at all.
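To make the idea of intercepting a memory read concrete, here is a small sketch that recognizes a GDB remote serial protocol memory-read packet (`$m<addr>,<length>#<checksum>`, with a modulo-256 checksum over the payload). It assumes the hypervisor sees complete packets as NUL-terminated strings; the function names are illustrative and are not part of bhyve.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Modulo-256 sum of the payload bytes between '$' and '#'. */
static unsigned
rsp_checksum(const char *payload, size_t len)
{
	unsigned sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum += (unsigned char)payload[i];
	return (sum & 0xff);
}

/*
 * Parse "$m<addr>,<length>#<cc>" (all fields in hex).  Returns 0 and
 * fills in addr/len for a valid memory-read packet, or -1 if the
 * packet is malformed, has a bad checksum, or is another command.
 */
static int
rsp_parse_mem_read(const char *pkt, uint64_t *addr, uint64_t *len)
{
	const char *hash, *p;
	char *end;
	size_t plen;

	assert(pkt != NULL && addr != NULL && len != NULL);
	if (pkt[0] != '$' || (hash = strchr(pkt, '#')) == NULL)
		return (-1);
	/* Exactly two hex digits must follow '#'. */
	if (!isxdigit((unsigned char)hash[1]) ||
	    !isxdigit((unsigned char)hash[2]) || hash[3] != '\0')
		return (-1);
	plen = (size_t)(hash - (pkt + 1));
	if (rsp_checksum(pkt + 1, plen) != strtoul(hash + 1, NULL, 16))
		return (-1);
	if (pkt[1] != 'm')
		return (-1);
	p = pkt + 2;
	if (!isxdigit((unsigned char)*p))
		return (-1);
	*addr = strtoull(p, &end, 16);
	if (*end != ',')
		return (-1);
	p = end + 1;
	if (!isxdigit((unsigned char)*p))
		return (-1);
	*len = strtoull(p, &end, 16);
	if (end != hash)
		return (-1);
	return (0);
}
```

A stub in the hypervisor could answer such a packet by reading guest memory itself and pass every other command through to the guest's debugger unchanged.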
Requirements
- Willingness to dig into OS internals
- Knowledge of i486/amd64 assembler and virtualization concepts
- Ability to decode Intel/AMD processor specs
VirtualBox shared folder support for FreeBSD guests
Technical Contact: lwhsu@, gonzo@, decke@, vbox@
Description
Oracle VirtualBox unofficially supports FreeBSD as a host and guest operating system. VirtualBox shared folder support allows folders on the host to be accessed from within the guest system. This is similar to how you would use network shares in Windows networks, except that shared folders do not require networking, only the VirtualBox Guest Additions.
This task was part of GSoC 2013, where the main focus was on porting the code to FreeBSD. The result was almost-working read-only support, but more bugs need to be fixed, and read-write support needs to be added and tested.
Requirements
- Ability to read and understand foreign C code
- Ability to write C code
- Knowledge of the VFS subsystem
Networking Projects
SCPS, Space Communication Protocol Standards
SCPS is a protocol suite designed to allow communication over challenging environments. Originally developed jointly by NASA and DoD's USSPACECOM, these protocols are used for commercial, educational, and military environments. A student project in this area would involve implementing various network protocols according to specification (SCPS File Protocol, similar to FTP; SCPS-Transport Protocol, based on TCP; and others.)
Note that the European Space Agency now has an ESA Summer of Code in Space program, and while FreeBSD is not a mentoring organization, interested students could motivate such a process.
References: Consultative Committee for Space Data Systems
Requirements
- Good knowledge of C and TCP.
- Able to understand the FreeBSD TCP/IP stack.
- A testbed with at least two machines.
Porting Projects
Port FreeBSD to new platforms
Porting to new platforms is a good way to learn the internals of FreeBSD and serves to check the general portability of the base system. While there are important efforts to maintain an external toolchain working it would be ideal to start working on newer platforms that are already supported by the toolchain in base and where the hardware support is either easy to find or available through emulation.
References:
Requirements
- Good Knowledge of C and assembler of the target platform.
- Familiarity with (cross)building FreeBSD.
Port FreeBSD to tablet devices
Port FreeBSD to ARM and Intel tablets and to upcoming mobile devices.
Requirements
- Some kernel experience
Testing and Continuous Integration projects
POSIX compliance testing framework
Description
Standards compliance has always been one of the main objectives for the FreeBSD project. In the past we had some efforts to follow regular testing procedures but we haven't followed up the efforts with proper and sustainable infrastructure.
The Open Group has made some testsuites freely available. In particular there used to be a FreeBSD port of the TET testing suite, but this was removed due to a lack of maintenance and other issues. The lsb-vsx Linux testsuite should also be cleaned up so that together we are able to do regular testing on both Linux emulation and FreeBSD-native support.
Any compliance issue should be reported in Bugzilla and an attempt to contact the respective group should be made to draw a plan towards compliance.
Travis Continuous Integration Support for FreeBSD
Technical Contact: rodrigc@
Description
Travis Continuous Integration is a very popular Continuous Integration system used by projects hosted on GitHub. If a GitHub project has a .travis.yml config file in the root directory, the Travis system will build and test the project if new code is committed to the GitHub project.
Currently, Travis only supports Linux and MacOS X. The Travis project closed issue 1818 which was a request to add FreeBSD support to Travis, due to lack of resources. However, having this support would be very useful, and allow FreeBSD to test many third party projects on GitHub.
Requirements
- Knowledge of Shell scripting in /bin/sh, Python, Ruby
- Basic knowledge of REST APIs and the GitHub API
- Knowledge of virtualization systems, such as bhyve, QEMU, KVM
Userland / Installation Tools Projects
Switch procstat from subcommand flags to verbs
Technical Contact: brooks@
Description
The procstat command has a number of flags which are in practice subcommands. The single letter namespace means that new flags often have poor mnemonics and locally added commands often end up colliding with newly added values. Procstat should be modified to take command verbs in addition to existing flags. Argument handling should be table driven to make it easy to add arguments with minimal merge conflicts.
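A table-driven layout could look like the sketch below, where each row ties a verb to the flag it replaces. The single-letter flags are existing procstat(1) options; the verb names, the structure and the lookup functions are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One table row per subcommand. */
struct procstat_cmd {
	const char *verb;	/* hypothetical command verb */
	char	    flag;	/* existing single-letter flag */
	const char *descr;	/* short help text */
};

static const struct procstat_cmd cmd_table[] = {
	{ "binary",    'b', "binary information" },
	{ "arguments", 'c', "command line arguments" },
	{ "files",     'f', "file descriptors" },
	{ "kstack",    'k', "kernel stacks" },
	{ "threads",   't', "threads" },
	{ "vm",        'v', "virtual memory mappings" },
};

#define	NCMDS	(sizeof(cmd_table) / sizeof(cmd_table[0]))

/* Look up a subcommand by verb; returns NULL if there is no such verb. */
static const struct procstat_cmd *
cmd_by_verb(const char *verb)
{
	size_t i;

	assert(verb != NULL);
	for (i = 0; i < NCMDS; i++) {
		if (strcmp(cmd_table[i].verb, verb) == 0)
			return (&cmd_table[i]);
	}
	return (NULL);
}

/* Look up a subcommand by its legacy flag, so old invocations keep working. */
static const struct procstat_cmd *
cmd_by_flag(int flag)
{
	size_t i;

	for (i = 0; i < NCMDS; i++) {
		if (cmd_table[i].flag == flag)
			return (&cmd_table[i]);
	}
	return (NULL);
}
```

Because each subcommand is one self-contained row, a locally added command touches a single line of the table, which keeps merge conflicts small.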
Requirements
- Knowledge of C.
BSD-licensed ELF Tools
Technical Contact: jkoshy@, kaiw@
Create BSD-licensed versions of ELF processing tools (e.g., ld, dbx, as and others) using the ELF(3) and GELF(3) API set. Identify overlapping functions in those tools and create a library out of the common functions. Identify parts which can be generated by tools (e.g., machine code parser generators) to support our Tier-1 and Tier-2 architectures.
References:
Requirements
- Knowledge of C.
BSD-licensed Text-Processing Tools
Part of Summer of Code 2010, Part of Summer of Code 2008
Technical Contact: gabor@
grep: It has been committed to the base system and is available as an alternative to GNU grep. The compatibility is good but the performance is well behind that of GNU grep, which prevents us from using it as the default. There are also some problems involving regular expressions. It is under active development by gabor@.
diff/diff3/sdiff: Many command-line options are supported but some features are still missing. Whether the three programs can be integrated into a single binary should be evaluated. A thorough performance benchmark should also be done. See SummerOfCode2012/JesseHagewood for the latest status.
mdocml: Some groff features are very hard to implement, but they aren't strictly needed to render our man pages. Still, some manuals do not compile with mdocml. Investigate the reasons and create a migration plan.
Requirements
- Knowledge of C.
NDMP data server
URL: The NDMP Initiative
The NDMP initiative was launched to create an open standard protocol for network-based backup for network-attached storage. Major commercial storage systems come with a compliant service. This allows major commercial backup systems to back up such NAS devices. Including an NDMP data server in FreeBSD would let it play nicely out of the box (modulo some configuration) with backups in a corporate environment.
- Evaluate the existing revisions of the NDMP standard.
- Choose an appropriate revision (after checking of supported versions in commercial backup systems).
- Implement at least an NDMP data server.
- Bonus: implement an NDMP tape server (to allow attached tapes to be used).
Requirements
- Access to a commercial backup system with NDMP support (mostly for interoperability testing; since an NDMPcopy application seems to be available, this is not a hard requirement).
- Good knowledge of a programming language which is included in the base system.
- Knowledge about UFS snapshots.
Port prebind from OpenBSD
The OpenBSD prebind is a secure implementation of prelinking that is compatible with address space randomization. Prelinking speeds up application startup when a lot of libraries are involved. This should show a noticeable effect with, e.g., GNOME/KDE.
Requirements
- Good C knowledge (reading and writing).
Proxy auto-config file support for libfetch
A proxy auto-config (PAC) file contains a JavaScript function "FindProxyForURL(url, host)" that determines which HTTP or SOCKS proxy, if any, to use to access a given URL. In most applications the file may be specified manually or discovered using the Web Proxy Autodiscovery Protocol. Support for PAC files in libfetch would make fetch more versatile.
Supporting PAC files nominally requires a fairly complete JavaScript implementation. Google's V8 JavaScript engine is BSD Licensed, however it compiles code to native machine code so platform support is an issue. However, the parser etc may provide a good starting point, and other engines may also exist and should be evaluated. A minimalist implementation of the language with commonly used constructs such as if/else, string comparison, and functions would be sufficient in many cases.
References:
Requirements
- Strong knowledge of secure C programming.
PXE Installer
It would be great to have a bundled PXE installer. This would allow one to boot an install server from a FreeSBIE live CD-ROM on one box, set the BIOS on subsequent boxes to PXE boot, and then have the rest happen by magic. This would be very helpful for installing cluster nodes, etc.
m@ is working on a bundled PXE installer as part of his BSDInstaller project within the Google Summer of Code 2006. The PXE Installer is working but some non-PXE related issues have to be solved before it can enter the tree.
Requirements
- Good PXE knowledge.
Improve cron(8) and atrun(8)
Currently, cron(8) and atrun(8) are outdated in their implementation. Here are some directions for improvement:
- Update cron(8) to ISC cron with security fixes from OpenBSD.
- Integrate the atrun(8) functionality into cron(8), as it was done in NetBSD.
Requirements
- Strong knowledge of the C language and Unix API.
libpw
Technical Contact:
Create a library to manage users and groups easily. It should also have a pam/nss-like plugin framework for different account systems.
- libutil has undocumented pw_* and gr_* functions which allow user/group manipulation.
- Write pw_*() and gr_*() manpages.
- Known users for that library: pkgng, pw(8).
Requirements
- Knowledge of C
- Knowledge of pam/nss
resurrect memory leak detector libmprof
Technical Contact: julian@
There used to be, many years ago, a port called mprof, which would do one thing, but do it very well: find memory leaks. Unfortunately it has bitrotted and no longer works. It reads symbols from the executable and correlates them with a detailed memory allocation trace to produce a very useful memory leak description.
Requirements
- Interest in the linker and object file formats, including debug symbols.
- Knowledge of C.
Safe crash dumps
Technical Contact: gavin@
Crash dumps are important for collecting debugging information, but they are disabled by default because they can consume a lot of space in /var if the user doesn't pay attention to them.
What is missing is a safe way to enable crash dumps by default without having to worry about them filling up /var.
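One simple policy, sketched here as an assumption rather than a description of any existing tool, is to save a dump only if a configurable reserve of free space would remain afterwards. The function names and the reserve parameter are hypothetical; statvfs() is the standard POSIX call for querying free space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/statvfs.h>

/*
 * Return nonzero if a dump of dump_size bytes can be stored while
 * still leaving at least reserve bytes free on the filesystem.
 */
static int
dump_fits(uint64_t avail, uint64_t dump_size, uint64_t reserve)
{
	if (avail < reserve)
		return (0);
	return (dump_size <= avail - reserve);
}

/*
 * Query the space available to unprivileged users on the filesystem
 * holding path.  Returns 0 on success, -1 if statvfs() fails.
 */
static int
fs_avail_bytes(const char *path, uint64_t *avail)
{
	struct statvfs sv;

	assert(path != NULL && avail != NULL);
	if (statvfs(path, &sv) != 0)
		return (-1);
	*avail = (uint64_t)sv.f_bavail * sv.f_frsize;
	return (0);
}
```

A complete solution would probably also need to expire old dumps, but a check of this kind already bounds how much of /var a single crash can consume.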
Requirements
- Knowledge of C
- Knowledge of the FreeBSD system internals
Import syslogd improvements from NetBSD
Technical Contacts: emaste@, markj@
Note - there is work in progress on this project, please contact Ed or Mark for details.
NetBSD's syslogd has a number of improvements from a former Google Summer of Code project available for porting:
- new syslog protocol api syslogp(3) that supports structured data and draft-rfc timestamps
- reliable tcp connections with queueing
- encrypted connections
The changes are in NetBSD's repository in src/usr.sbin/syslogd, available for viewing on their cvsweb:
Requirements
- Good knowledge about the C programming language.
Add support for usbdump file-format to wireshark and vusb-analyzer
Technical Contact: hselasky@
Support for the usbdump file-format has now been added to wireshark:
https://wiki.freebsd.org/SummerOfCode2017/usbdump-wireshark
Support for setting base system build options via dialog(1)
Technical Contact: ??
Our ports have supported setting build options via dialog(1) for ages. Recently, with the introduction of pkgng, it also became possible to record a port's build options in the binary package; later, functionality to track dependencies based on the build options may be added.
Our base system build infrastructure now prefers consistent /etc/src.conf options over older ad-hoc /etc/make.conf options. Currently, there are only binary options, which can therefore be mapped to dialog(1) checkboxes. The idea is to be able to run cd /usr/src && make config and see the familiar dialog(1) interface, just as in any port in /usr/ports (it should be complemented with the usual make showconfig and make rmconfig, of course).
Besides directly simplifying the average user's life, this idea is valuable with regard to possible future packaging of the FreeBSD base system into one or a few pkgng packages. Such packaging is desired in some large production server farms for easier management and upgrading of servers; see https://github.com/z0nt/pkg for such a project. In the future, this could also be useful for tracking pkgng package dependencies based on the world's build options (e.g. a port requires a world built with WITH_IDEA or not built with WITHOUT_PF, etc.).
@bapt: note that it is already possible to store information about the options used to build base using the current framework, without the need for the dialog(1) interface.
Requirements
- Good knowledge of make and shell code
- Knowledge of the FreeBSD build and installation infrastructure
RAID and disk monitoring suite
Technical Contact:
There have been several organizations that have independently developed RAID and disk failure monitoring tools. These should be gathered together into a unified group of daemons and monitoring scripts to provide a consistent view of disk status.
http://svnweb.freebsd.org/base/user/sbruno/ard/
http://svnweb.freebsd.org/base/user/sbruno/mfid/
http://svnweb.freebsd.org/base/user/sbruno/mptd/
Requirements
- Knowledge of /etc/rc.d scripts and ordering
- Basic understanding of C and UNIX
- Shell scripting in /bin/sh
Cross-building FreeBSD from Linux and/or Mac OSX
Technical Contact: brooks@
FreeBSD's build system is self-contained, but only really designed to be run on FreeBSD, as it makes assumptions about the host platform and available tools. Being able to build a fully-functional FreeBSD userland and kernel from either Linux or Mac OSX, without having to use FreeBSD inside a virtual machine, would allow more people to make use of FreeBSD and build products out of it more easily.
The NetBSD approach of bootstrapping via build.sh is one way to go. Another is to make autoconf versions of the key build tools and ideally add them to debian/macports/homebrew/etc so users can install the set of things they need and build from there. The bmake program is typically already available (perhaps in a somewhat old form), so it may be viable to use in the bootstrap process.
On Mac OSX an additional complication exists in that the default filesystem is case-insensitive. At least to begin with, creating a case-sensitive filesystem to do this work on is recommended. Some work has already been done here, and it may be desirable to build upon that.
Requirements
- Knowledge of build infrastructure, Makefiles, etc
- Knowledge of compilers and linkers
- Access to a Linux or Mac OSX machine (for OSX, use a case-sensitive filesystem)
Global Projects (may touch everything)
EPUB Support in Documentation Build Infrastructure
Suggested Summer of Code project idea
Enhance the FreeBSD Documentation Project build infrastructure to generate EPUB format output suitable for eBook readers such as iPads and Kindles.
Requirements
- Ability to work with Makefiles
- Knowledge of SGML/XML transforms
PerfVisor (PERFormance adVISOR)
Coordination: netchild@
The goal of this project is to get a tool which can analyze a system for performance bottlenecks and maybe even give some hints what to do next.
This is not a GSoC project. Maybe small parts of it can be done during a GSoC.
Prerequisites you should know/read to understand the project steps:
http://dtrace.org/blogs/brendan/2012/12/13/usenix-lisa-2012-performance-analysis-methodology/
http://queue.acm.org/detail.cfm?id=2413037 or http://dtrace.org/blogs/brendan/2012/02/29/the-use-method/
http://dtrace.org/blogs/brendan/2012/03/01/the-use-method-solaris-performance-checklist/
http://dtrace.org/blogs/brendan/2012/03/07/the-use-method-linux-performance-checklist/
http://www.slideshare.net/brendangregg/zfsperftools2012 or http://www.brendangregg.com/Slides/zfsperftools2012.pdf
http://dtrace.org/blogs/brendan/2012/12/10/usenix-lisa-2010-visualizations-for-performance-analysis/
http://dtrace.org/blogs/brendan/2012/03/26/subsecond-offset-heat-maps/
and http://dtrace.org/blogs/brendan/ in general...
Project phases
Initial phase
- Identify/list possible resources in FreeBSD. See the performance checklist postings above for ideas. Those are just a start; the list can be extended iteratively to include subitems (e.g. in a first iteration the "network stack" could be a resource, in a second iteration each network interface is a resource, in a third iteration one network protocol could be a resource, in a fourth iteration just the TX queue of a NIC can be a resource, ...). Not everything needs to be identified initially: at some point, further work on this resource list should be deferred until all or some of the following items are handled for the resources already listed. When all OS resources are listed, further iterations could even include 3rd party applications (e.g. apache httpd requests, mysql table scans, ...). Interesting values for each resource can be bandwidth (total/percentage), operations per second, latency, usage duration, time until completion, ... Ideally the result is written down in the FreeBSD wiki. For each "high-level" resource (e.g. network stack), try to document some kind of "where to look next if saturated/on-error/..." resource (e.g. network interface, protocol checks, ...).
- For each resource identify/document "utilization", "saturation" and "error" values (see the USE method in the prerequisites), and code points in the source which show them. If there's no code point in the source, mark the resource for later investigation. Hardware-provided info, e.g. CPU counters, counts as "code points" in this sense too.
- For each resource determine/document existing tools which provide the required information (saturation/error) and how to use them to get the required info.
- For each resource without an existing way to get the required information, but with a code point in the source which provides the required information, write a dtrace/hwpmc/whatever script/program/whatever to query the required information.
- For each resource without a code point in the source which provides the required information, find a "cheap" way to determine the resource value and add corresponding code readable by sysctl or dtrace (readout variable on existing probe or a specific SDT probe just for this) or whatever (and write a script/program/whatever as above).
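The bookkeeping behind the initial phase can be sketched as a small data structure. This is only an illustration under stated assumptions, not a proposed design: the names `Metric`, `Resource` and `report` are hypothetical, and the example entries are placeholders rather than a vetted inventory. Each resource carries its three USE values, each value records whether an existing tool or a code point is known, and child resources play the role of the "where to look next" pointers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Metric:
    """One USE value (utilization, saturation or errors) of a resource."""
    code_point: Optional[str] = None  # where in the source the value lives
    tool: Optional[str] = None        # existing tool that already reports it

    def status(self) -> str:
        if self.tool:
            return "documented"        # an existing tool covers it
        if self.code_point:
            return "needs script"      # write a dtrace/hwpmc/... script
        return "needs code point"      # add a sysctl or probe first


@dataclass
class Resource:
    name: str
    utilization: Metric = field(default_factory=Metric)
    saturation: Metric = field(default_factory=Metric)
    errors: Metric = field(default_factory=Metric)
    # More detailed resources: "where to look next" when this one is a problem.
    children: List["Resource"] = field(default_factory=list)

    def walk(self, depth=0):
        """Yield (depth, resource) for this resource and all its subitems."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def report(root):
    """Return one line per resource and USE value with its current status."""
    lines = []
    for depth, res in root.walk():
        for kind in ("utilization", "saturation", "errors"):
            status = getattr(res, kind).status()
            lines.append(f"{'  ' * depth}{res.name}: {kind}: {status}")
    return lines


# Placeholder entries for illustration only (assumptions, not findings).
nic = Resource(
    "network interface",
    utilization=Metric(code_point="placeholder: per-interface byte counters"),
    errors=Metric(tool="netstat -i"),
)
stack = Resource("network stack", children=[nic])

print("\n".join(report(stack)))
```

A later iteration would add more children (protocols, TX queues, ...) without changing the structure, and the "needs code point" lines form the list of resources marked for later investigation.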
Improvement phase
- For each existing code point with an existing script, check if a new way of handling this info (sysctl, dtrace probe, ...) would make the sampling of this data cheaper (= less performance impact of the performance monitoring itself).
Visualization phase
- Write a tool (for the ports collection) which visualizes data (heat maps, flame graphs, ...) gathered from the above performance data collection scripts (either stored somewhere or by calling the scripts directly) and provides hints where to look next in case a resource is the bottleneck. The blog of Brendan Gregg offers many ideas on how to visualize with various types of graphs. Suggestion: make it system independent and plan for use on *BSD, Solaris and Linux.
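As a rough idea of the data preparation behind a latency heat map, the following sketch counts samples per (time, latency) bucket and renders the counts as text. It is a minimal illustration, assuming samples arrive as (timestamp in seconds, latency in milliseconds) pairs with non-negative values; the function names and the character shading are hypothetical choices, not part of any existing tool.

```python
from collections import Counter


def latency_heatmap(samples, time_step=1.0, latency_step=1.0):
    """Count samples per (time column, latency row) bucket.

    samples: iterable of (timestamp_seconds, latency_ms) pairs,
    both assumed to be non-negative.
    """
    grid = Counter()
    for timestamp, latency in samples:
        column = int(timestamp // time_step)
        row = int(latency // latency_step)
        grid[(column, row)] += 1
    return grid


def render(grid, shades=" .:*#"):
    """Render the grid as text; the fullest bucket gets the last shade."""
    if not grid:
        return ""
    columns = max(c for c, _ in grid) + 1
    rows = max(r for _, r in grid) + 1
    peak = max(grid.values())
    lines = []
    for row in range(rows - 1, -1, -1):  # highest latency on top
        line = ""
        for column in range(columns):
            count = grid.get((column, row), 0)
            # Round up so that any non-empty bucket is visible.
            index = (count * (len(shades) - 1) + peak - 1) // peak
            line += shades[index]
        lines.append(line)
    return "\n".join(lines)


samples = [(0.1, 0.5), (0.2, 0.7), (1.5, 2.3)]
print(render(latency_heatmap(samples)))
```

A real tool would draw colored cells instead of characters, but the bucketing step stays the same regardless of the operating system that produced the samples.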
Cloud phase (everything is cloudy nowadays...)
- Write an agent which is able to collect the data which the visualizing tool is able to handle, and extend the visualizing tool to collect data from agents on other machines (via an encrypted and authenticated connection if needed = possible but not mandatory, depending on the environment).
Datawarehouse phase (buzzword bingo!)
- Write a gateway/collector service which a) is able to "handle" agents in a datacenter (query and store data), b) is able to handle bulk-data-retrieval requests from remote locations (transfer data from remote datacenters to the company headquarters), and c) is able to store/archive queried data for later use ("How did this look last week/month/year?").
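The store/archive part of such a collector could look roughly like the sketch below. It assumes SQLite as the storage backend and a flat table of (host, resource, metric, time, value) rows; both are illustrative assumptions, as are the function names, and a real collector might well choose a different store and schema.

```python
import sqlite3


def open_archive(path=":memory:"):
    """Open (or create) the archive; the schema is an assumption."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        " host TEXT NOT NULL,"
        " resource TEXT NOT NULL,"
        " metric TEXT NOT NULL,"
        " taken_at INTEGER NOT NULL,"  # Unix time in seconds
        " value REAL NOT NULL)"
    )
    db.execute(
        "CREATE INDEX IF NOT EXISTS samples_lookup"
        " ON samples (host, resource, metric, taken_at)"
    )
    return db


def store(db, host, resource, metric, taken_at, value):
    """Archive one value reported by an agent."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT INTO samples VALUES (?, ?, ?, ?, ?)",
            (host, resource, metric, taken_at, value),
        )


def history(db, host, resource, metric, start, end):
    """Answer "how did this look between start and end?" (end exclusive)."""
    rows = db.execute(
        "SELECT taken_at, value FROM samples"
        " WHERE host = ? AND resource = ? AND metric = ?"
        " AND taken_at >= ? AND taken_at < ?"
        " ORDER BY taken_at",
        (host, resource, metric, start, end),
    )
    return rows.fetchall()


db = open_archive()
store(db, "web1", "cpu", "utilization", 100, 0.5)
store(db, "web1", "cpu", "utilization", 200, 0.9)
store(db, "web2", "cpu", "utilization", 150, 0.1)
print(history(db, "web1", "cpu", "utilization", 0, 300))
```

Bulk retrieval from a remote location would then amount to running `history` over a wide time range and shipping the rows, which is one reason to index on time.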
Discussion
The first item in the initial phase is a big item. It may be better to see the complete initial phase as an iterative or divide-and-conquer step. First determine some high-level resources and process all steps of the initial phase on them. Then repeat the initial phase by breaking up the high-level resources into more detailed items (e.g. first "all CPUs in system" as one resource, second "each CPU in system" as a resource, third breaking up each CPU via hardware performance counters).
GSoC info
Each phase is a big project of its own. Do not expect to be able to do it over a few weekends or during a single GSoC. If you think you can, you have not grasped the full scope; think again. If you still insist after rethinking, be our guest, but you had better prepare a very good outline of what you want to do, when, how, and at which level of detail (e.g. come with a list of resources for the first item in the initial phase), and also include what you will NOT do/cover/handle but could be related (e.g. CPU internal performance counters if you want to handle the CPU performance side of this). You already need to know FreeBSD (you have been using it on your server or desktop for several months) and you should not be afraid to ask questions or join discussions on the mailing lists. This is not a project for 2h per day; if you are not motivated to spend time on this, you had better choose a different topic.
This can be made "big enough" to be a project suitable for finals at your university (depending on the requirements of your university, with or without extending it beyond the end of the GSoC).
Other Projects
If you are interested in working on a project not explicitly mentioned above, you may want to contact one of the potential Technical Contacts below:
Additionally, there are a lot of interesting mailing lists that can be used when searching for information about specific subjects.
For porting ideas, go to WantedPorts; for beginner tasks, check out JuniorJobs.