HOW TO GET YOUR CODE ACCEPTED IN HAPROXY
READ THIS CAREFULLY BEFORE SUBMITTING CODE

THIS DOCUMENT PROVIDES SOME RULES TO FOLLOW WHEN SENDING CONTRIBUTIONS. PATCHES NOT FOLLOWING THESE RULES WILL SIMPLY BE IGNORED IN ORDER TO PROTECT ALL OTHER RESPECTFUL CONTRIBUTORS' VALUABLE TIME.


Abstract
--------

If you have never contributed to HAProxy before, or if you did so and noticed that nobody seems to be interested in reviewing your submission, please do read this long document carefully. HAProxy maintainers are particularly demanding about respecting certain simple rules related to general code and documentation style, as well as splitting your patches and providing high quality commit messages. The reason behind this is that your patch will be encountered many times in the future, during backporting work or when bisecting a bug, and it is critical that anyone can quickly decide whether the patch is right or wrong, whether it misses something, and whether it must be reverted or needs to be backported.

Maintainers are generally benevolent with newcomers and will help them provided their work indicates they have at least read this document. Some have improved over time, to the point of being totally trusted and gaining commit access so they don't need to depend on anyone to pick their code. Conversely, those who insist on not making minimal efforts will simply be ignored.


Background
----------

HAProxy is a community-driven project. But like most highly technical projects it takes a lot of time to develop the skills necessary to be autonomous in the project, and there is a very small core team helped by a small set of very active participants. While most of the core team members work on the code as part of their day job, most participants do it on a voluntary basis during their spare time.
The ideal model for developers is to spend their time:

  1) developing new features
  2) fixing bugs
  3) doing maintenance backports
  4) reviewing other people's code

It turns out that on a project like HAProxy, like many other similarly complex projects, the time spent is exactly the opposite:

  1) reviewing other people's code
  2) doing maintenance backports
  3) fixing bugs
  4) developing new features

A large part of the time spent reviewing code often consists of giving basic recommendations that are already explained in this file. In addition to taking time, it is not appealing for people willing to spend one hour helping others to do the same thing over and over instead of discussing the code design, and it tends to delay the start of code reviews.

Regarding backports, they are necessary to provide a set of stable branches that are deployed in production at many places. Load balancers are complex and new features often induce undesired side effects in other areas, which we will call bugs. Thus it's common for users to stick to a branch featuring everything they need and not to upgrade too often. This backporting job is critical to the ecosystem's health and must be done regularly. Very often the person devoting some time to backports has little to no information about the relevance (let alone importance) of a patch and is unlikely to be an expert in the area affected by the patch. It's the role of the commit message to explain WHAT problem the patch tries to solve, WHY it is estimated that it is a problem, and HOW it tries to address it. With these elements, the person in charge of the backports can decide whether or not to pick the patch. And if the patch does not apply (which is common for older versions) they have information in the commit message about the principle and choices that the initial developer made and will try to adapt the patch sticking to these principles.
Thus, the time spent backporting patches solely depends on the code quality and the commit message details and accuracy.

When it comes to fixing bugs, before declaring a bug, there is an analysis phase. It starts with "is this behaviour expected", "is it normal", "under what circumstances does it happen", "when did it start to happen", "was it intended", "was it just overlooked", and "how to fix it without breaking the initial intent". A utility called "git bisect" is usually involved in determining when the behaviour started to happen. It determines the first patch which introduced the new behaviour. If the patch is huge, touches many areas, is really difficult to read because it needlessly reindents code or adds/removes line breaks out of context, it will be very difficult to figure out which part of this patch broke the behaviour. Then once that part is identified, if the commit message doesn't provide a detailed description about the intent of the patch, i.e. the problem it was trying to solve, why and how, the developer landing on that patch will really feel powerless. And very often in this case, the fix for the problem will break something else or something that depended on the original patch.

But contrary to what it could look like, providing great quality patches is not difficult, and developers will always help contributors improve the quality of their patches because it's in their interest as well. History has shown that first time contributors can provide excellent work when they have carefully read this document, and that people coming from projects with different practices can grow from first-time contributor to trusted committer in about 6 months.


Preparation
-----------

It is possible that you'll want to add a specific feature to satisfy your needs or one of your customers'. Contributions are welcome, however maintainers are often very picky about changes.
Patches that change massive parts of the code, or that touch the core parts without any good reason will generally be rejected if those changes have not been discussed first.

The proper place to discuss your changes is the HAProxy Mailing List. There are enough skilled readers to catch hazardous mistakes and to suggest improvements. There is no other place where you'll find as many skilled people on the project, and these people can help you get your code integrated quickly. You can subscribe to it by sending an empty e-mail to the following address :

    haproxy+subscribe@formilux.org

It is not even necessary to subscribe, you can post there and verify via the public list archives that your message was properly delivered. In this case you should indicate in your message that you'd like responders to keep you CCed. Please visit http://haproxy.org/ to find the available options to join the list.

If you have an idea about something to implement, *please* discuss it on the list first. It has already happened several times that two people did the same thing simultaneously. This is a waste of time for both of them. It's also very common to see some changes rejected because they're done in a way that will conflict with future evolutions, or that does not leave a good feeling. It's always unpleasant for the person who did the work, and it is unpleasant in general because people's time and efforts are valuable and would be better spent working on something else. That would not happen if these were discussed first. There is no problem posting work in progress to the list, it happens quite often in fact. Just prefix your mail subject with "RFC" (it stands for "request for comments") and everyone will understand you'd like some opinion on your work in progress. Also, don't waste your time with the doc when submitting patches for review, only add the doc with the patch you consider ready to merge (unless you need some help on the doc itself, of course).
Another important point concerns code portability. HAProxy requires gcc as the C compiler, and may or may not work with other compilers. However it's known to build using gcc 2.95 or any later version. As such, it is important to keep in mind that certain facilities offered by recent versions must not be used in the code:

  - declarations mixed in the code (requires gcc >= 3.x and is a bad practice) ;
  - GCC builtins without checking for their availability based on version and architecture ;
  - assembly code without any alternate portable form for other platforms ;
  - use of stdbool.h, "bool", "false", "true" : simply use "int", "0", "1" ;
  - in general, anything which requires C99 (such as declaring variables in "for" statements).

Since most of these restrictions are just a matter of coding style, it is normally not a problem to comply. Please read doc/coding-style.txt for all the details.

When modifying some optional subsystem (SSL, Lua, compression, device detection engines), please make sure the code continues to build (and to work) when these features are disabled. Similarly, when modifying the SSL stack, please always ensure that supported OpenSSL versions continue to build and to work, especially if you modify support for alternate libraries. Clean support for the legacy OpenSSL libraries is mandatory, support for its derivatives is a bonus and may occasionally break even though great care is taken. In other words, if you provide a patch for OpenSSL you don't need to test its derivatives, but if you provide a patch for a derivative you also need to test with OpenSSL.

If your work is very confidential and you can't publicly discuss it, you can also mail willy@haproxy.org directly about it, but your mail may be waiting several days in the queue before you get a response, if you get a response at all. Resend it if you don't get a response within one week.
Please note that e-mails sent directly to this address for non-confidential subjects may simply be forwarded to the list or be deleted without notification. An auto-responder bot is in place to try to detect e-mails from people asking for help and to redirect them to the mailing list. Do not be surprised if this happens to you.

If you'd like a feature to be added but you think you don't have the skills to implement it yourself, you should follow these steps :

  1. discuss the feature on the mailing list. It is possible that someone else has already implemented it, or that someone will tell you how to proceed without it, or even why not to do it. It is also possible that in fact it's quite easy to implement and people will guide you through the process. That way you'll finally have YOUR patch merged, providing the feature YOU need.

  2. if you really can't code it yourself after discussing it, then you may consider contacting someone to do the job for you. Some people on the list might sometimes be OK with trying to do it.

The version control system used by the project (Git) keeps authorship information in the form of the patch author's e-mail address. This way you will be credited for your work in the project's history. If you contract with someone to implement your idea you may have to discuss such modalities with the person doing the work as by default this person will be mentioned as the work's author.


Rules: the 12 laws of patch contribution
----------------------------------------

People contributing patches must apply the following rules. That may sound heavy at the beginning but it's common sense more than anything else and contributors do not think about them anymore after a few patches.

1) Comply with the license

Before modifying any code, you must have read the LICENSE file ("main license") coming with the sources, and all the files this file references. Certain files may be covered by different licenses, in which case it will be indicated in the files themselves.
In any case, you agree to respect these licenses and to contribute your changes under the same licenses. If you want to create new files, they will be under the main license, or any license of your choice that you have verified to be compatible with the main license, and that will be explicitly mentioned in the affected files. The project's maintainers are free to reject contributions proposing license changes they feel are not appropriate or could cause future trouble.

2) Develop on development branch, not stable ones

Your work may only be based on the latest development version. No development is made on a stable branch. If your work needs to be applied to a stable branch, it will first be applied to the development branch and only then will be backported to the stable branch. You are responsible for ensuring that your work correctly applies to the development version. If at any moment you are going to work on restructuring something important which may impact other contributors, the rule that applies is that the first sent is the first served. However it is considered good practice and politeness to warn others in advance if you know you're going to make changes that may force them to re-adapt their code, because they probably did not expect to have to spend more time discovering your changes and rebasing their work.

3) Read and respect the coding style

You have read and understood "doc/coding-style.txt", and you're actively determined to respect it and to enforce it on your coworkers if you're going to submit a team's work. We don't care what text editor you use, whether it's a hex editor, cat, vi, emacs, Notepad, Word, or even Eclipse. The editor is only the interface between you and the text file. What matters is what is in the text file in the end. The editor is not an excuse for submitting poorly indented code, which only proves that the person has no consideration for quality and/or has done it in a hurry (probably worse).
Please note that most bugs were found in low-quality code. Reviewers know this and tend to be much more reluctant to accept poorly formatted code because from experience they don't trust its author's ability to write correct code. It is also worth noting that poor quality code is painful to read and may result in nobody willing to waste their time even reviewing your work.

4) Present clean work

The time it takes for you to polish your code is always much smaller than the time it takes others to do it for you, because they always have to wonder if what they see is intended (meaning they didn't understand something) or if it is a mistake that needs to be fixed. And since there are fewer reviewers than submitters, it is vital to spread the effort closer to where the code is written and not closer to where it gets merged. For example if you have to write a report for a customer that your boss wants to review before you send it to the customer, will you throw on his desk a pile of paper with stains, typos and copy-pastes everywhere ? Will you say "come on, OK I made a mistake in the company's name but they will find it by themselves, it's obvious it comes from us" ? No. When in doubt, simply ask for help on the mailing list.

5) Documentation is very important

There are four levels of importance of quality in the project :

  - The most important one, and by far, is the quality of the user-facing documentation. This is the first contact for most users and it immediately gives them an accurate idea of how the project is maintained. Dirty docs necessarily belong to a dirty project. Be careful about the way the text you add is presented and indented. Be very careful about typos, usual mistakes such as double consonants when only one is needed or "it's" instead of "its", don't mix US English and UK English in the same paragraph, etc. When in doubt, check in a dictionary.
    Fixes for existing typos in the doc are always welcome and chasing them is a good way to become familiar with the project and to get other participants' respect and consideration.

  - The second most important level is user-facing messages emitted by the code. You must try to see all the messages your code produces to ensure they are understandable outside of the context where you wrote them, because the user often doesn't expect them. That's true for warnings, and that's even more important for errors which prevent the program from working and which require an immediate and well understood fix in the configuration. It's much better to say "line 35: compression level must be an integer between 1 and 9" than "invalid argument at line 35". In HAProxy, error handling roughly represents half of the code, and that's about 3/4 of the configuration parser. Take the time to do something you're proud of. A good rule of thumb is to keep in mind that your code talks to a human and tries to teach them how to proceed. It must then speak like a human.

  - The third most important level is the code and its accompanying comments, including the commit message which is a complement to your code and comments. It's important for all other contributors that the code is readable, fluid, understandable and that the commit message describes what was done, the choices made, the possible alternatives you thought about, the reason for picking this one and its limits if any. Comments should be written where it's easy to have a doubt or after some error cases have been wiped out and you want to explain what possibilities remain. All functions must have a comment indicating what they take on input and what they provide on output. Please adjust the comments when you copy-paste a function or change its prototype, this type of lazy mistake is too common and very confusing when reading code later to debug an issue.
    Do not forget that others will feel really angry at you when they have to dig into your code for a bug that your code caused and they feel like this code is dirty or confusing, that the commit message doesn't explain anything useful and that the patch should never have been accepted in the first place. That will strongly impact your reputation and will definitely affect your chances to contribute again!

  - The fourth level of importance is in the technical documentation that you may want to add with your code. Technical documentation is always welcome as it helps others make the best use of your work and to go exactly in the direction you thought about during the design. This is also what reduces the risk that your design gets changed in the near future due to a misuse and/or a poor understanding. All such documentation is actually considered as a bonus. It is more important that this documentation exists than that it looks clean. Sometimes just copy-pasting your draft notes in a file to keep a record of design ideas is better than losing them. Please do your best so that others can read your doc. If these docs require a special tool such as a graphics utility, ensure that the file name makes it unambiguous how to process it. So there are no rules here for the contents, except one. Please write the date in your file. Design docs tend to stay forever and to remain long after they become obsolete. At this point they can cause more harm than good. Writing the date in the document helps developers guess the degree of validity and/or compare them with the date of certain commits touching the same area.

6) US-ASCII only!

All text files and commit messages are written using the US-ASCII charset. Please be careful that your contributions do not contain any character not printable using this charset, as they will render differently in different editors and/or terminals.
Avoid latin1 and more importantly UTF-8 which some editors tend to abuse to replace some US-ASCII characters with their typographic equivalent which aren't readable anymore in other editors. The only place where alternative charsets are tolerated is in your name in the commit message, but it's at your own risk as it can be mangled during the merge. Anyway if you have an e-mail address, you probably have a valid US-ASCII representation for it as well.

7) Comments

Be careful about comments when you move code around. It's not acceptable that a block of code is moved to another place leaving irrelevant comments at the old place, just like it's not acceptable that a function is duplicated without the comments being adjusted. The example below started to become quite common during the 1.6 cycle, it is not acceptable and wastes everyone's time :

    /* Parse switching <str> to build rule <rule>. Returns 0 on error. */
    int parse_switching_rule(const char *str, struct rule *rule)
    {
        ...
    }

    /* Parse switching <str> to build rule <rule>. Returns 0 on error. */
    void execute_switching_rule(struct rule *rule)
    {
        ...
    }

This patch is not acceptable either (and it's unfortunately not that rare) :

    +   if (!session || !arg || list_is_empty(&session->rules->head))
    +       return 0;
    +
        /* Check if session->rules is valid before dereferencing it */
        if (!session->rules_allocated)
            return 0;

    -   if (!arg || list_is_empty(&session->rules->head))
    -       return 0;
    -

8) Short, readable identifiers

Limit the length of your identifiers in the code. When your identifiers start to sound like sentences, it's very hard for the reader to keep track of what operation they are observing. Also long names force expressions to fit on several lines which also causes some difficulties to the reader.
See the example below :

    int file_name_len_including_global_path;
    int file_name_len_without_global_path;
    int global_path_len_or_zero_if_default;

    if (global_path)
        global_path_len_or_zero_if_default = strlen(global_path);
    else
        global_path_len_or_zero_if_default = 0;

    file_name_len_without_global_path = strlen(file_name);

    file_name_len_including_global_path =
        file_name_len_without_global_path + 1 + /* for '/' */
        global_path_len_or_zero_if_default ?
        global_path_len_or_zero_if_default : default_path_len;

Compare it to this one :

    int f, p;

    p = global_path ? strlen(global_path) : default_path_len;
    f = p + 1 + strlen(file_name);  /* 1 for '/' */

A good rule of thumb is that if your identifiers start to contain more than 3 words or more than 15 characters, they can become confusing. For function names it's less important especially if these functions are rarely used or are used in a complex context where it is important to differentiate between their multiple variants.

9) Unified diff only

The best way to build your patches is to use "git format-patch". This means that you have committed your patch to a local branch, with an appropriate subject line and a useful commit message explaining what the patch attempts to do. It is not strictly required to use git, but what is strictly required is to have all these elements in the same mail, easily distinguishable, and a patch in "diff -up" format (which is also the format used by Git). This means the "unified" diff format must be used exclusively, and with the function name printed in the diff header of each block. That significantly helps during reviews. Keep in mind that most reviews are done on the patch and not on the code after applying the patch. Your diff must keep some context (3 lines above and 3 lines below) so that there's no doubt where the code has to be applied. Don't change code outside of the context of your patch (eg: take care of not adding/removing empty lines once you remove your debugging code).
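As an illustration of the workflow described above, the sketch below builds a throwaway repository, commits a one-line fix and exports it with "git format-patch". The file name sum.c, the add() function, the author identity and both commit messages are made up for the example; only the git commands themselves are real.

```shell
#!/bin/sh
# Sketch only: everything below lives in a temporary directory.
set -e

work=$(mktemp -d)
git init -q "$work"
# Placeholder identity, needed so that "git commit" works anywhere.
git -C "$work" config user.name "Mr Foobar"
git -C "$work" config user.email "foobar@example.com"

# Hypothetical source file containing a bug.
cat > "$work/sum.c" <<'EOF'
int add(int a, int b)
{
    int r;

    r = a - b;
    return r;
}
EOF
git -C "$work" add sum.c
git -C "$work" commit -q -m "MINOR: sum: add the add() helper" \
    -m "Initial import, only used as a base for the example."

# The fix itself: one line changed, nothing reindented around it.
sed 's/a - b/a + b/' "$work/sum.c" > "$work/sum.c.new"
mv "$work/sum.c.new" "$work/sum.c"
git -C "$work" commit -q -a -F - <<'EOF'
BUG/MINOR: sum: make add() return the sum

add() returned a - b instead of a + b, so every caller got a wrong
result as soon as b was not zero. No backport is needed.
EOF

# Export the last commit as a mail-ready patch: subject, description,
# then a unified diff with context and the function name in the hunk header.
git -C "$work" format-patch -1 -o "$work/out" >/dev/null
cat "$work"/out/0001-*.patch
```

The resulting file carries the subject, the description and the diff in one place, which is what a reviewer reads before the code itself.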
If you are using Git (which is strongly recommended), always use "git show" after doing a commit to ensure it looks good, and enable syntax coloring that will automatically report in red the trailing spaces or tabs that your patch added to the code and that must absolutely be removed. These make patches really painful to apply later because they mangle the context in an invisible way. Such patches with trailing spaces at end of lines will be rejected.

10) One patch per feature

Please cut your work in series of patches that can be independently reviewed and merged. Each patch must do something on its own that you can explain to someone without being ashamed of what you did. For example, you must not say "This is the patch that implements SSL, it was tricky". There's clearly something wrong there, your patch will be huge, will definitely break things and nobody will be able to figure out what exactly introduced the bug. However it's much better to say "I needed to add some fields in the session to store the SSL context so this patch does this and doesn't touch anything else, so it's safe".

Also when dealing with series, you will sometimes fix a bug that one of your patches introduced. Please do merge these fixes (eg: using git rebase -i and squash or fixup), as it is not acceptable to see patches which introduce known bugs even if they're fixed later. Another benefit of cleanly splitting patches is that if some of your patches need to be reworked after a review, the other ones can still be merged so that you don't need to care about them anymore. When sending multiple patches for review, prefer to send one e-mail per patch rather than all patches in a single e-mail. The reason is that not everyone is skilled in all areas nor has the time to review everything at once. With one patch per e-mail, it's easy to comment on a single patch without giving an opinion on the other ones, especially if a long thread starts about one specific patch on the mailing list.
"git send-email" does that for you though it requires a few trials before getting it right.

If you can, please always put all the bug fixes at the beginning of the series. This often makes it easier to backport them because they will not depend on context that your other patches changed. As a hint, if you can't do this, there is little chance that your bug fix can be backported.

11) Real commit messages please!

The commit message is how you're trying to convince a maintainer to adopt your work and maintain it as long as possible. A dirty commit message almost always comes with dirty code. Too short a commit message indicates that too short an analysis was done and that side effects are extremely likely to be encountered. It's the maintainer's job to decide to accept this work in its current form or not, with the known constraints. Some patches which rework architectural parts or fix sensitive bugs come with 20-30 lines of design explanations, limitations, hypotheses or even doubts, and despite this it happens when reading them 6 months later while trying to identify a bug that developers still miss some information about corner cases.

So please properly format your commit messages. To get an idea, just run "git log" on the file you've just modified. Patches always have the format of an e-mail made of a subject, a description and the actual patch. If you are sending a patch as an e-mail formatted this way, it can quickly be applied with limited effort so that's acceptable :

  - A subject line (may wrap to the next line, but please read below)
  - an empty line (subject delimiter)
  - a non-empty description (the body of the e-mail)
  - the patch itself

The subject describes the "What" of the change ; the description explains the "why", the "how" and sometimes "what next".
For example a commit message looking like this will be rejected :

    | From: Mr Foobar
    | Subject: BUG: fix typo in ssl_sock
    |

This one as well (too long subject, not the right place for the details) :

    | From: Mr Foobar
    | Subject: BUG/MEDIUM: ssl: use an error flag to prevent ssl_read() from
    | returning 0 when dealing with large buffers because that can cause
    | an infinite loop
    |

This one ought to be used instead :

    | From: Mr Foobar
    | Subject: BUG/MEDIUM: ssl: fix risk of infinite loop in ssl_sock
    |
    | ssl_read() must not return 0 on error or the caller may loop forever.
    | Instead we add a flag to the connection to notify about the error and
    | check it at all call places. This situation can only happen with large
    | buffers so a workaround is to limit buffer sizes. Another option would
    | have been to return -1 but it required to use signed ints everywhere
    | and would have made the patch larger and riskier. This fix should be
    | backported to versions 1.2 and upper.

It is important to understand that for any reader to guess the text above when it's absent, it will take a huge amount of time. If you made the analysis leading to your patch, you must explain it, including the ideas you dropped if you had a good reason for this.

While it's not strictly required to use Git, it is strongly recommended because it helps you do the cleanest job with the least effort. But if you are comfortable with writing clean e-mails and inserting your patches, you don't need to use Git.

But in any case, it is important that there is a clean description of what the patch does, the motivation for what it does, why it's the best way to do it, its impacts, and what it does not yet cover. And this is particularly important for bugs. A patch tagged "BUG" must absolutely explain what the problem is and why it is considered a bug. Anybody, even non-developers, should be able to tell whether or not a patch is likely to address an issue they are facing.
Indicating what the code will do after the fix doesn't help if it does not say what problem is encountered without the patch. Note that in some cases the bug is purely theoretical and observed by reading the code. In this case it's perfectly fine to provide an estimate about possible effects.

Also, in HAProxy, like many projects which take great care of maintaining stable branches, patches are reviewed later so that some of them can be backported to stable releases.

While reviewing hundreds of patches can seem cumbersome, with a proper formatting of the subject line it actually becomes very easy. For example, here's how one can find patches that need to be reviewed for backports (bugs and doc) since commit ID 827752e :

    $ git log --oneline 827752e.. | grep 'BUG\|DOC'
    0d79cf6 DOC: fix function name
    bc96534 DOC: ssl: missing LF
    10ec214 BUG/MEDIUM: lua: the lua function Channel:close() causes a segf
    bdc97a8 BUG/MEDIUM: lua: outgoing connection was broken since 1.6-dev2
    ba56d9c DOC: mention support for RFC 5077 TLS Ticket extension in start
    f1650a8 DOC: clarify some points about SSL and the proxy protocol
    b157d73 BUG/MAJOR: peers: fix current table pointer not re-initialized
    e1ab808 BUG/MEDIUM: peers: fix wrong message id on stick table updates
    cc79b00 BUG/MINOR: ssl: TLS Ticket Key rotation broken via socket comma
    d8e42b6 DOC: add new file intro.txt
    c7d7607 BUG/MEDIUM: lua: bad error processing
    386a127 DOC: match several lua configuration option names to those impl
    0f4eadd BUG/MEDIUM: counters: ensure that src_{inc,clr}_gpc0 creates a

It is made possible by the fact that subject lines are properly formatted and always respect the same principle : one part indicating the nature and severity of the patch, another one to indicate which subsystem is affected, and the last one is a succinct description of the change, with the important part at the beginning so that it's obvious what it does even when lines are truncated like above.
The whole stable maintenance process relies on this. For this reason, it is mandatory to respect some easy rules regarding the way the subject is built. Please see the section below for more information regarding this formatting.

As a rule of thumb, your patch MUST NEVER be made only of a subject line, it *must* contain a description. Even one or two lines will do, or a simple indication of whether a backport is desired or not. It turns out that single-line commits are so rare in the Git world that they require special manual (hence painful) handling when they are backported, and at least for this reason it's important to keep this in mind.

Maintainers who pick your patch may slightly adjust the description as they see fit. Do not see this as a failure to do a clean job, it just means they think it will help them do their daily job this way. The code may also be slightly adjusted before being merged (non-functional changes only, fix for typos, tabs vs spaces for example), unless your patch contains a Signed-off-By tag, in which case they will either modify it and mention the changes after your Signed-off-By line, or (more likely) ask you to perform these changes yourself. This ability to slightly adjust a patch before merging is the main reason for not using pull requests, which do not provide this facility and would require iterating back and forth with the submitter, significantly delaying the patch inclusion.

Each patch fixing a bug MUST be tagged with "BUG", a severity level, an indication of the affected subsystem and a brief description of the nature of the issue in the subject line, and a detailed analysis in the message body. The explanation of the user-visible impact and the need for backporting to stable branches or not are MANDATORY. Bug fixes with no indication will simply be rejected as they are very likely to cause more harm when nobody is able to tell whether or not the patch needs to be backported or can be reverted in case of regression.
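The two requirements above (a tagged subject and a non-empty description) are easy to check before sending anything. The function below is a minimal sketch and not a project tool: the name check_bug_message is hypothetical, and the severity list only contains the levels visible in the "git log" output shown earlier, so it may be incomplete.

```shell
#!/bin/sh
# check_bug_message FILE
# Hypothetical helper: succeeds only if FILE looks like an acceptable
# commit message for a bug fix.
check_bug_message() {
    # The first line must read "BUG/<severity>: <subsystem>: <description>".
    # The severity list is an assumption taken from the examples above.
    head -n 1 "$1" |
        grep -Eq '^BUG/(MINOR|MEDIUM|MAJOR): [^: ]+: .+' || return 1

    # In awk paragraph mode (RS=""), the subject is the first record, so
    # a description exists only if there are at least two records.
    awk 'BEGIN { RS = "" } END { exit (NR < 2) }' "$1"
}

# Usage example with the message recommended earlier in this document.
msg=$(mktemp)
cat > "$msg" <<'EOF'
BUG/MEDIUM: ssl: fix risk of infinite loop in ssl_sock

ssl_read() must not return 0 on error or the caller may loop forever.
This fix should be backported to versions 1.2 and upper.
EOF

if check_bug_message "$msg"; then
    echo "commit message looks acceptable"
else
    echo "commit message needs work"
fi
```

Such a check only covers the form. Whether the description really explains the problem and its impact remains a matter for the author and the reviewers.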
When fixing a bug which is reproducible, if possible, the contributors are
strongly encouraged to write a regression testing VTC file for varnishtest to
add to the reg-tests directory. More information about varnishtest may be found
in the README file of the reg-tests directory and in the
doc/regression-testing.txt file.

12) Discuss on the mailing list

Note, some first-time contributors might feel intimidated or scared by posting
to a list. This list is frequented only by nice people who are willing to help
you polish your work so that it is perfect and can last long. What you think
could be perceived as a proof of incompetence or lack of care will instead be a
proof of your ability to work with a community. You will not be judged nor
blamed for making mistakes. The project maintainers are the ones creating the
most bugs and mistakes anyway, and nobody knows the project in its entirety
anymore so you're just like anyone else. And people who have no consideration
for others' work are quickly ejected from the list so the place is as safe and
welcoming to new contributors as it is to long time ones.

When submitting changes, please always CC the mailing list address so that
everyone gets a chance to spot any issue in your code. It will also serve as an
advertisement for your work, you'll get more testers quicker and you'll feel
better knowing that people really use your work.

It's often convenient to prepend "[PATCH]" in front of your mail's subject to
mention that this e-mail contains a patch (or a series of patches), because it
will easily catch reviewers' attention. It's automatically done by tools such
as "git format-patch" and "git send-email". If you don't want your patch to be
merged yet and prefer to show it for discussion, better tag it as "[RFC]"
(stands for "Request For Comments") and it will be reviewed but not merged
without your approval.
It is also important to CC any author mentioned in the file you change, or any
subsystem maintainer whose address is mentioned in a MAINTAINERS file. Not
everyone reads the list on a daily basis so it's very easy to miss some
changes. Don't consider it as a failure when a reviewer tells you you have to
modify your patch, actually it's a success because now you know what is
missing for your work to get accepted. That's why you should not hesitate to
CC enough people. Don't copy people who have nothing to do with your work area
just because you found their address on the list. That's the best way to
appear careless about their time and make them reject your changes in the
future.


Patch classifying rules
-----------------------

There are 3 criteria of particular importance in any patch :
  - its nature (is it a fix for a bug, a new feature, an optimization, ...)
  - its importance, which generally reflects the risk of merging/not merging it
  - what area it applies to (eg: http, stats, startup, config, doc, ...)

It's important to make these 3 criteria easy to spot in the patch's subject,
because it's the first (and sometimes the only) thing which is read when
reviewing patches to find which ones need to be backported to older versions.
It also helps when trying to find which patch is the most likely to have
caused a regression.

Specifically, bugs must be clearly easy to spot so that they're never missed.
Any patch fixing a bug must have the "BUG" tag in its subject. Most common
patch types include :

  - BUG      fix for a bug. The severity of the bug should also be indicated
             when known. Similarly, if a backport is needed to older versions,
             it should be indicated on the last line of the commit message. The
             commit message MUST ABSOLUTELY describe the problem and its impact
             to non-developers. Any user must be able to guess if this patch is
             likely to fix a problem they are facing.
             Even if the bug was discovered by accident while reading the code
             or running an automated tool, it is mandatory to try to estimate
             what potential issue it might cause and under what circumstances.
             There may even be security implications sometimes so a minimum
             analysis is really required. Also please think about stable
             maintainers who have to build the release notes, they need to have
             enough input about the bug's impact to explain it. If the bug has
             been identified as a regression brought by a specific patch or
             version, this indication will be appreciated too. New maintenance
             releases are generally emitted when a few of these patches are
             merged. If the bug is a vulnerability for which a CVE identifier
             was assigned before you publish the fix, you can mention it in the
             commit message, it will help distro maintainers.

  - CLEANUP  code cleanup, silence of warnings, etc... theoretically no impact.
             These patches will rarely be seen in stable branches, though they
             may appear when they remove some annoyance or when they make
             backporting easier. By nature, a cleanup is always of minor
             importance and it's not needed to mention it.

  - DOC      updates to any of the documentation files, including README. Many
             documentation updates are backported since they don't impact the
             product's stability and may help users avoid bugs. So please
             indicate in the commit message if a backport is desired. When a
             feature gets documented, it's preferred that the doc patch appears
             in the same patch or after the feature patch, but not before, as
             it becomes confusing when someone working on a code base including
             only the doc patch won't understand why a documented feature does
             not work as documented.

  - REORG    code reorganization. Some blocks may be moved to other places,
             some important checks might be swapped, etc... These changes
             always present a risk of regression. For this reason, they should
             never be mixed with any bug fix nor functional change. Code is
             only moved as-is.
             Indicating the risk of breakage is highly recommended. Minor
             breakage is tolerated in such patches if trying to fix it at once
             makes the whole change even more confusing. That may happen for
             example when some #ifdefs need to be propagated in every file
             consecutive to the change.

  - BUILD    updates or fixes for build issues. Changes to makefiles also fall
             into this category. The risk of breakage should be indicated if
             known. It is also appreciated to indicate what platforms and/or
             configurations were tested after the change.

  - OPTIM    some code was optimised. Sometimes if the regression risk is very
             low and the gains significant, such patches may be merged in the
             stable branch. Depending on the amount of code changed or replaced
             and the level of trust the author has in the change, the risk of
             regression should be indicated. If the optimization depends on the
             architecture or on build options, it is important to verify that
             the code continues to work without it.

  - RELEASE  release of a new version (development or stable).

  - LICENSE  licensing updates (may impact distro packagers).

  - REGTEST  updates to any of the regression testing files found in reg-tests
             directory, including README or any documentation file.

When the patch cannot be categorized, it's best not to put any type tag, and to
use only a risk or complexity indication as described below. This is commonly
the case for new features, which development versions are mostly made of.

The importance, complexity of the patch, or severity of the bug it fixes must
be indicated when relevant. A single upper-case word is preferred, among :

  - MINOR    minor change, very low risk of impact. It is often the case for
             code additions that don't touch live code. As a rule of thumb, a
             patch tagged "MINOR" is safe enough to be backported to stable
             branches. For a bug, it generally indicates an annoyance, nothing
             more.

  - MEDIUM   medium risk, may cause unexpected regressions of low importance or
             which may quickly be discovered.
             In short, the patch is safe but touches working areas and it is
             always possible that you missed something you didn't know existed
             (eg: adding a "case" entry or an error message after adding an
             error code to an enum). For a bug, it generally indicates
             something odd which requires changing the configuration in an
             undesired way to work around the issue.

  - MAJOR    major risk of hidden regression. This happens when large parts of
             the code are rearranged, when new timeouts are introduced, when
             sensitive parts of the session scheduling are touched, etc... We
             should only exceptionally find such patches in stable branches
             when there is no other option to fix a design issue. For a bug, it
             indicates severe reliability issues for which workarounds are
             identified with or without performance impacts.

  - CRITICAL medium-term reliability or security is at risk and workarounds,
             if they exist, might not always be acceptable. An upgrade is
             absolutely required. A maintenance release may be emitted even if
             only one of these bugs is fixed. Note that this tag is only used
             with bugs. Such patches must indicate what is the first version
             affected, and if known, the commit ID which introduced the issue.

The expected length of the commit message grows with the importance of the
change. While a MINOR patch may sometimes be described in 1 or 2 lines, MAJOR
or CRITICAL patches cannot have fewer than 10-15 lines to describe exactly the
impacts, otherwise the submitter's work will be considered as rough sabotage.
If you are sending a new patch series after a review, it is generally good to
enumerate at the end of the commit description what changed from the previous
one as it helps reviewers quickly glance over such changes and not re-read the
rest.

For BUILD, DOC and CLEANUP types, this tag is not always relevant and may be
omitted.
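Because the severity always sits right after the type in the subject, the
"git log --oneline" listing shown earlier can be narrowed down to the most
severe fixes only. The sketch below is an illustration, not an official tool;
the function name "severe_bugs" is hypothetical. It keeps the lines whose
subject starts with BUG/MAJOR or BUG/CRITICAL.

```shell
# severe_bugs (hypothetical helper, not an HAProxy tool): filter
# "git log --oneline" output read on stdin, keeping only bug fixes tagged
# MAJOR or CRITICAL. Each input line is expected to look like
# "<abbreviated commit id> <subject>".
severe_bugs() {
    # "|| true" keeps the exit status at 0 when nothing matches, so an
    # empty result is not reported as an error.
    grep -E '^[0-9a-f]+ BUG/(MAJOR|CRITICAL):' || true
}
```

It would be used as "git log --oneline 827752e.. | severe_bugs". With the
listing above as input, only the "BUG/MAJOR: peers" line would remain.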
The area the patch applies to is quite important, because some areas are known
to be similar in older versions, suggesting a backport might be desirable, and
conversely, some areas are known to be specific to one version. The area is a
single-word lowercase name the contributor finds clear enough to describe what
part is being touched. The following list of tags is suggested but not
exhaustive:

  - examples   example files. Be careful, sometimes these files are packaged.
  - tests      regression test files. No code is affected, no need to upgrade.
  - reg-tests  regression test files for varnishtest. No code is affected, no
               need to upgrade.
  - init       initialization code, arguments parsing, etc...
  - config     configuration parser, mostly used when adding new config
               keywords
  - http       the HTTP engine
  - stats      the stats reporting engine
  - cli        the stats socket CLI
  - checks     the health checks engine (eg: when adding new checks)
  - sample     the sample fetch system (new fetch or converter functions)
  - acl        the ACL processing core or some ACLs from other areas
  - filters    everything related to the filters core
  - peers      the peer synchronization engine
  - lua        the Lua scripting engine
  - listeners  everything related to incoming connection settings
  - frontend   everything related to incoming connection processing
  - backend    everything related to LB algorithms and server farm
  - session    session processing and flags (very sensitive, be careful)
  - server     server connection management, queueing
  - spoe       SPOE code
  - ssl        the SSL/TLS interface
  - proxy      proxy maintenance (start/stop)
  - log        log management
  - poll       any of the pollers
  - halog      the halog sub-component in the contrib directory
  - contrib    any addition to the contrib directory
  - htx        general HTX subsystem
  - mux-h1     HTTP/1.x multiplexer/demultiplexer
  - mux-h2     HTTP/2 multiplexer/demultiplexer
  - h1         general HTTP/1.x protocol parser
  - h2         general HTTP/2 protocol parser

Other names may be invented when more precise indications are meaningful, for
instance : "cookie" which
indicates cookie processing in the HTTP core. Last, indicating the name of the
affected file is also a good way to quickly spot changes. Many commits were
already tagged with "stream_sock" or "cfgparse" for instance.

It is required that the type of change, the severity when relevant, and the
touched area when relevant are all indicated in the patch subject. Most often,
all 3 are present. The first two criteria should be present before a first
colon (':'). If both are present, then they should be delimited with a slash
('/'). The 3rd criterion (area) should appear next, also followed by a colon.
Thus, all of the following subject lines are valid :

Examples of subject lines :
  - DOC: document options forwardfor to logasap
  - DOC/MAJOR: reorganize the whole document and change indenting
  - BUG: stats: connection reset counters must be plain ascii, not HTML
  - BUG/MINOR: stats: connection reset counters must be plain ascii, not HTML
  - MEDIUM: checks: support multi-packet health check responses
  - RELEASE: Released version 1.4.2
  - BUILD: stats: stdint is not present on solaris
  - OPTIM/MINOR: halog: make fgets parse more bytes by blocks
  - REORG/MEDIUM: move syscall redefinition to specific places

Please do not use square brackets anymore around the tags, because they induce
more work when merging patches, which need to be hand-edited not to lose the
enclosed part.

In fact, one of the only square bracket tags that still makes sense is '[RFC]'
at the beginning of the subject, when you're asking for someone to review your
change before getting it merged. If the patch is OK to be merged, then it can
be merged as-is and the '[RFC]' tag will automatically be removed. If you don't
want it to be merged at all, you can simply state it in the message, or use an
alternate 'WIP/' prefix in front of your tag ("work in progress").

The tags are not rigid, follow your intuition first, and they may be readjusted
when your patch is merged.
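The subject layout described above can be roughly checked before sending a
patch. What follows is a minimal sketch under stated assumptions, not a tool
shipped with HAProxy: the function name "check_subject" is hypothetical, only
the type and severity tags listed in this document are recognized, and the
optional area and the 'WIP/' prefix are not verified. Since the tags are not
rigid, a failure should be read as a hint to reread the subject, not as a
rejection.

```shell
# check_subject (hypothetical helper, not an HAProxy tool): succeed if the
# subject line passed as first argument starts with "TYPE:", "TYPE/SEVERITY:"
# or "SEVERITY:", followed by a space and some text. Only the tags listed in
# this document are known; the area after the first colon is not checked.
check_subject() {
    types='BUG|CLEANUP|DOC|REORG|BUILD|OPTIM|RELEASE|LICENSE|REGTEST'
    levels='MINOR|MEDIUM|MAJOR|CRITICAL'
    printf '%s\n' "$1" |
        grep -Eq "^((${types})(/(${levels}))?|(${levels})): [^ ]"
}
```

A possible use is
check_subject "$(git log -1 --format=%s)" || echo "please recheck the subject",
which tests the subject of the most recent commit. A subject using square
brackets such as "[BUG] stats: ..." is reported as not matching.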
It may happen that the same patch has a different tag in two distinct branches.
The reason is that a bug in one branch may just be a cleanup or safety measure
in the other one because the code cannot be triggered.


Working with Git
----------------

For a more efficient interaction between the mainline code and your code, you
are strongly encouraged to try the Git version control system :

    http://git-scm.com/

It's very fast, lightweight and lets you undo/redo your work as often as you
want, without making your mistakes visible to the rest of the world. It will
definitely help you contribute quality code and take other people's feedback
in consideration. In order to clone the HAProxy Git repository :

    $ git clone http://git.haproxy.org/git/haproxy.git/   (development)

If you decide to use Git for your developments, then your commit messages will
have the subject line in the format described above, then the whole description
of your work (mainly why you did it) will be in the body. You can directly send
your commits to the mailing list, the format is convenient to read and process.

It is recommended to create a branch for your work that is based on the master
branch :

    $ git checkout -b 20150920-fix-stats master

You can then do your work and even experiment with multiple alternatives if you
are not completely sure that your solution is the best one :

    $ git checkout -b 20150920-fix-stats-v2

Then reorder/merge/edit your patches :

    $ git rebase -i master

When you think you're ready, reread your whole patchset to ensure there is no
formatting or style issue :

    $ git show master..
And once you're satisfied, you should update your master branch to be sure that
nothing changed during your work (only needed if you left it unattended for
days or weeks) :

    $ git checkout -b 20150920-fix-stats-rebased
    $ git fetch origin master:master
    $ git rebase master

You can build a list of patches ready for submission like this :

    $ git format-patch master

The output files are the patches ready to be sent over e-mail, either via a
regular e-mail or via git send-email (carefully check the man page). Don't
destroy your other work branches until your patches get merged, it may happen
that earlier designs will be preferred for various reasons. Patches should be
sent to the mailing list : haproxy@formilux.org and CCed to relevant subsystem
maintainers or authors of the modified files if their address appears at the
top of the file.

Please don't send pull requests, they are really inconvenient as they make it
much more complicated to perform minor adjustments, and nobody benefits from
any comment on the code while on a list all subscribers learn a little bit on
each review of anyone else's code.


What to do if your patch is ignored
-----------------------------------

All merged patches are acknowledged by the maintainer who picked them. If you
didn't get an acknowledgement, check the mailing list archives to see if your
mail was properly delivered there and possibly if anyone responded and you did
not get their response (please look at http://haproxy.org/ for the mailing list
archive's address).

If you see that your mail is there but nobody responded, please recheck:
  - was the subject clearly indicating that it was a patch and/or that you were
    seeking some review?
  - was your email mangled by your mail agent? If so it's possible that nobody
    had the willingness yet to mention it.
  - was your email sent as HTML? If so it definitely ended in spam boxes
    regardless of the archives.
  - did the patch violate some of the principles explained in this document?
If none of these cases matches, it might simply be that everyone was busy when
your patch was sent and that it was overlooked. In this case it's fine to
either resubmit it or respond to your own email asking if anything's wrong
about it. In general don't expect a response after one week of silence, just
because your email will not appear in anyone else's current window. So after
one week it's time to resubmit.

Among the mistakes that tend to make reviewers not respond is sending multiple
versions of a patch in a row. It's natural for others then to wait for the
series to stabilize. And once it doesn't move anymore everyone forgot about it.
As a rule of thumb, if you have to update your original email more than twice,
first double-check that your series is really ready for submission, and second,
start a new thread and stop responding to the previous one. In this case it is
well appreciated to mention a version of your patch set in the subject such as
"[PATCH v2]", so that reviewers can immediately spot the new version and not
waste their time on the old one.

If you still do not receive any response, it is possible that you've already
played your last card by not respecting the basic principles multiple times
despite being told about it several times, and that nobody is willing to spend
more of their time than normally needed with your work anymore. Your best
option at this point probably is to ask "did I do something wrong" rather than
to resend the same patches.


How to be sure to irritate everyone
-----------------------------------

Among the best ways to quickly lose everyone's respect, there is this small
selection, which should help you improve the way you work with others, if you
notice you're already practising some of them:

  - repeatedly send improperly formatted commit messages, with no type or
    severity, or with no commit message body. These ones require manual
    editing, maintainers will quickly learn to recognize your name.
  - repeatedly send patches which break something, and disappear or take a long
    time to provide a fix.

  - fail to respond to questions related to features you have contributed in
    the past, which can further lead to the feature being declared unmaintained
    and removed in a future version.

  - send a new patch iteration without taking *all* comments from previous
    review into consideration, so that the reviewer discovers they have to do
    the exact same work again.

  - "hijack" an existing thread to discuss something different or promote your
    work. This will generally make you look like a fool so that everyone wants
    to stay away from your e-mails.

  - continue to send pull requests after having been told why they are not
    welcome.

  - give wrong advice to people asking for help, or send them patches to try
    which make no sense, waste their time, and give them a bad impression of
    the people working on the project.

  - be disrespectful to anyone asking for help or contributing some work. This
    may actually even get you kicked out of the list and banned from it.

-- end