Rulesets and Rewrite Rules

--------------------- /etc/sendmail.cf (continued) ---------------------

    # Local and Program Mailer specification (mandatory)
    Mlocal, P=/bin/mail, F=rlsDFMmnP, S=10, R=20, A=mail -d $u
    Mprog,  P=/bin/sh,   F=lsDFMeuP,  S=10, R=20, A=sh -c $u

    # Ethernet Mailer specification
    # Messages processed by this configuration are assumed to remain
    # in the same domain. This really has nothing particular to do
    # with Ethernet - the name is historical.
    Mether, P=[TCP], F=msDFMuCX, S=11, R=21, A=TCP $h

    # UUCP Mailer specification
    Muucp,  P=/usr/bin/uux, F=msDFMhuU, S=13, R=23, A=uux - -r -a$f $h!rmail ($u)

--------------------------- (to be continued) ---------------------------

Two mailers, local and prog, are mandatory in every sendmail configuration file. In this case they are the program /bin/mail and the Bourne shell /bin/sh. Two other mailers are defined: ether, for sendmail communication through the network (specified by [TCP]), and uucp, the program /usr/bin/uux, for UUCP delivery via phone line.

20.2.2 Rulesets and Rewrite Rules

Rulesets define how to transform e-mail addresses into a format suitable for e-mail delivery. A leading uppercase letter S and the ruleset number identify them. Newer sendmail versions also allow textual identification of a ruleset, which makes it easier to determine its purpose (the name usually describes the basic function of the ruleset). A ruleset includes one or more rewriting rules, which are individual line entries that each define a specific address transformation; an empty ruleset is also allowed. A leading uppercase letter R identifies rewriting rule entries. The end of a ruleset is defined by the beginning of the next ruleset, or by any other configuration entry except a rewrite rule entry.

The input to a ruleset is an address to be parsed, and the output is the parsed address. A ruleset is called by sendmail directly, or by another ruleset. Rewriting rule entries within a ruleset are processed sequentially. An empty ruleset (one without any rewriting rule entry) leaves an address unchanged: the input and output addresses are equal. Let us see what rulesets look like:

--------------------- /etc/sendmail.cf (continued) ---------------------

    # Rewriting rules

    # Sender Field Pre-rewriting
    S1                                  (an empty ruleset)
    # None needed.

    # Recipient Field Pre-rewriting
    S2                                  (an empty ruleset)
    # None needed.

    # Name Canonicalization
    # Internal format of names within the rewriting rules is:
    #     anything<@host.domain.domain...>anything
    # We try to get every kind of name into this format, except for local ...

    = = = = = = = = = = here is the beginning of Ruleset 3

    S3
    # handle "from:<>" special case
    R$*<>$*         $@@                 turn into magic token

    # basic textual canonicalization
    R$*<$+>$*       $2                  basic RFC822 parsing

    # make sure <@a,@b,@c:user@d> syntax is easy to parse -- undone later
    R@$+,$+:$+      @$1:$2:$3           change all "," to ":"
    R@$+:$+         $@$>6<@$1>:$2       src route canonical
    R$+:$*;@$+      $@$1:$2;@$3         list syntax
    R$+@$+          $:$1<@$2>           focus on domain
    R$+<$+@$+>      $1$2<@$3>           move gaze right
    R$+<@$+>        $@$>6$1<@$2>        already canonical

    # convert old-style names to domain-based names
    # All old-style names parse from left to right, without precedence.
    R$-!$+          $@$>6$2<@$1.uucp>   uucphost!user
    R$-.$+!$+       $@$>6$3<@$1.$2>     host.domain!user
    R$+%$+          $@$>3$1@$2          user%host

    = = = = = = = = = = here is the end of Ruleset 3

    # Final Output Post-rewriting
    S4
    R$+<@$+.uucp>   $2!$1               u@h.uucp => h!u
    R$+             $: $>9 $1           Clean up addr
    R$*<$+>$*       $1$2$3              defocus

    # Rewriting rules
    (A number of rewrite rules follow, but they are not presented here.)

--------------------------- (to be continued) ---------------------------

The specified rulesets and rewrite rules are the only ones that sendmail knows about and follows. They must be sufficient for complete and correct e-mail processing. It is not very common to modify this part, although deeper sendmail customization usually involves this section. This is also the most probable place to look if problems in e-mail delivery are encountered. If a modification in this section is unavoidable, it must be done extremely carefully.

20.2.2.1 The Ruleset Sequence

We already mentioned that a ruleset can be invoked from another ruleset, or by sendmail directly. A direct ruleset invocation is the result of the ruleset sequence coded in the sendmail program. By default, all e-mail addresses follow the same ruleset path during their parsing. Originally, this path was uniform for all e-mail addresses.
However, based on the accumulated experience of long-time sendmail usage and on increased security demands, the default ruleset sequence has since been modified and improved. Since sendmail version 8, separate default ruleset paths have been introduced for parsing envelope and header e-mail addresses. Simply, the envelope and ...

This is presented in Figure 20.4. Depending on the implemented sendmail version, the flow of addresses through the default rulesets called directly by sendmail corresponds to one of the two ruleset patterns.

Figure 20.4: Sequence of rulesets.

Each box marked by a number defines the numeric name of the corresponding ruleset.

• The "S=" box stands for a ruleset whose numeric name is defined by the S field in the mailer definition. Each mailer may specify its own ruleset for mailer-specific cleanup of the sender address before the message is delivered.
• The "R=" box stands for a ruleset whose numeric name is defined by the R field in the mailer definition. Each mailer may specify its own ruleset for mailer-specific cleanup of the recipient address before the message is delivered.

Rulesets can be thought of as subroutines, or functions, designed to process e-mail addresses. They are called from mailer definitions, from individual rewrite rules, or directly by sendmail. Five rulesets built into sendmail have special functions:

1. Ruleset 3 is the first ruleset to be applied to addresses. It converts an address to the canonical form: local-part@host.domain. Do not forget that for quite some time the Internet e-mail addressing concept, which completely prevails nowadays, was only one of several implemented e-mail addressing mechanisms. In such a diverse addressing environment, it was extremely important to find some common ground.
2. Ruleset 0 is applied to the addresses used to deliver the e-mail. It is applied after ruleset 3, and only to the recipient addresses that are actually used for e-mail delivery. It resolves the recipient address to the triple mailer-host-user. This is presented in Figure 20.5.
3. Ruleset 1 is applied to all sender addresses in the message. Nowadays it is usually an empty ruleset.
4. Ruleset 2 is applied to all recipient addresses in the message. Nowadays it is usually an empty ruleset.
5. Ruleset 4 is applied to all addresses and is used to translate internal address formats back into the initial external address formats.

Figure 20.5: Ruleset 0 resolves a triple: {mailer, host, user}.

There are, of course, many other rulesets specified in the sendmail configuration file. These rulesets provide additional address processing and are called by existing rulesets using the $>n construct, or by sendmail according to the selected mailer upon completion of ruleset 0: the boxes "S=" and "R=" in Figure 20.4.

Apart from the listed rulesets, which are hard-coded in the sendmail program, the identification of other rulesets is essentially arbitrary. A ruleset could be named arbitrarily with numbers, letters, or a combination of both, and the corresponding rewrite rule modified to call the newly identified ruleset.
However, there are some conventions in naming a ruleset, and it is highly recommended to stay ...

The following table lists the usual naming of rulesets for certain purposes, other than those hard-coded in sendmail:

    Ruleset       Purpose
    1x            Mailer rules (sender qualification)
    2x            Mailer rules (recipient qualification)
    3x            Mailer rules (sender header qualification)
    4x            Mailer rules (recipient header qualification)
    5x, 6x, 7x    Mailer subroutines (general)
    8x            Reserved
    90            Mailtable host stripping
    96            Bottom half of ruleset 3 (ruleset 6 in old sendmail)
    97            Hook for recursive ruleset 0 call (ruleset 7 in old sendmail)
    98            Local part of ruleset 0 (ruleset 8 in old sendmail)
    99            Guaranteed null (for debugging)
    Text          New sendmail versions also use textual ruleset naming

20.2.2.2 The Ruleset 0

A special section in the sendmail configuration file is dedicated to ruleset 0; this is the core ruleset for e-mail delivery. It parses the e-mail address and makes the crucial decision of where and how to deliver e-mail. To make such a decision, sendmail always applies ruleset 0 to the recipient's e-mail address; the output must be either a decision about the destination and the corresponding mailer, or an error. There is even a special rewrite rule syntax for ruleset 0.

Ruleset 0 defines the triple {mailer, host, user} that specifies the mail delivery program, the recipient host, and the recipient user. This is presented in Figure 20.5. The special transformation syntax in ruleset 0 is:

    $#mailer $@host $:user

where:

    mailer            Mailer name defined by the M command in the sendmail.cf file
    host              Hostname of the host to deliver e-mail to (could be different from the recipient host)
    user              Username of the recipient user on the recipient host
    $#, $@, and $:    Leading constructs for these three parts, respectively

There is one special variant of this syntax, also used only in ruleset 0, that passes an error message to the user:

    $#error $:message

where:

    message           Arbitrary text of the error message returned to the user
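To make the shape of a ruleset 0 result more tangible, here is a minimal Python sketch that splits such a result into its three parts. It assumes the conventional sendmail notation `$#mailer $@host $:user` with all macros already expanded; the names `Delivery` and `parse_ruleset0_result`, and the sample user `joe`, are hypothetical and exist only for this illustration. sendmail itself works on token lists internally, so this is an aid to reading the rules, not a description of its implementation.

```python
import re
from typing import NamedTuple, Optional


class Delivery(NamedTuple):
    """Hypothetical container for the triple resolved by ruleset 0."""
    mailer: str
    host: Optional[str]   # absent in forms such as "$#local $:user"
    user: str             # for the error mailer this holds the message text


# "$#mailer", an optional "$@host" part, then "$:user"
_TRIPLE = re.compile(
    r"^\$#\s*(?P<mailer>[^\s$]+)\s*"
    r"(?:\$@\s*(?P<host>.*?)\s*)?"
    r"\$:\s*(?P<user>.*)$"
)


def parse_ruleset0_result(result: str) -> Delivery:
    """Split an already-expanded ruleset 0 result into (mailer, host, user)."""
    match = _TRIPLE.match(result.strip())
    if match is None:
        raise ValueError(f"not a ruleset 0 result: {result!r}")
    return Delivery(match.group("mailer"), match.group("host"), match.group("user"))


if __name__ == "__main__":
    print(parse_ruleset0_result("$#ether $@mailhost $:joe"))
    print(parse_ruleset0_result("$#local $:joe"))
    print(parse_ruleset0_result("$#error $:Unknown host"))
```

In this sketch the error variant is not treated specially: it is returned as a triple whose mailer is `error` and whose user field carries the message text, which mirrors how the two syntaxes share the same leading constructs.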
Note: All final decision-making rewrite rules (those whose right-hand side begins with $#) are printed in bold-italic.

--------------------- /etc/sendmail.cf (continued) ---------------------

    # RULESET ZERO PREAMBLE
    # Ruleset 30 just calls rulesets 3 then 0.
    S30
    R$*                 $: $>3 $1                   First canonicalize
    R$*                 $@ $>0 $1                   Then rerun ruleset 0

    S0
    # On entry, the address has been canonicalized and focused by ruleset 3.
    # Handle special cases.....
    R@                  $#local $:$n                handle <> form

    # Earlier releases special-cased the [x.y.z.a] format, but SunOS 4.1
    # or later should handle these properly on input.

    # now delete redundant local info
    R$*<$*$=w.LOCAL>$*  $1<$2>$4                    thishost.LOCAL
    R$*<@LOCAL>$*       $1<@$m>$2                   host == domain gateway
    R$*<$*$=w.uucp>$*   $1<$2>$4                    thishost.uucp
    R$*<$*$=w>$*        $1<$2>$4                    thishost

    # arrange for local names to be fully qualified
    R$*<@$%y>$*         $1<@$2.LOCAL>$3             user@etherhost

    # For numeric spec, you can't pass spec on to receiver, since old rcvr's
    # were not smart enough to know that [x.y.z.a] is their own name.
    R<@[$+]>:$*         $:$>9 <@[$1]>:$2            Clean it up, then...
    R<@[$+]>:$*         $#ether $@[$1] $:$2         numeric internet spec
    R<@[$+]>,$*         $#ether $@[$1] $:$2         numeric internet spec
    R$*<@[$+]>          $#ether $@[$2] $:$1         numeric internet spec

    R$*<$*.>$*          $1<$2>$3                    drop trailing dot
    R<@>:$*             $@$>30$1                    retry after route strip
    R$*<@>              $@$>30$1                    strip null trash & retry

    # Machine dependent part of ruleset zero

    # resolve names we can handle locally
    R<@$=V.uucp>:$+     $:$>9 $1                    First clean up, then...
    R<@$=V.uucp>:$+     $#uucp $@$1 $:$2            @host.uucp:...
    R$+<@$=V.uucp>      $#uucp $@$2 $:$1            user@host.uucp

    # optimize names of known ethernet hosts
    R$*<@$%y.LOCAL>$*   $#ether $@$2 $:$1<@$2>$3    user@host.here

    # other non-local names will be kicked upstairs
    R$+                 $:$>9 $1                    Clean up, keep <>
    R$*<@$+>$*          $#$M $@$R $:$1<@$2>$3       user@some.where
    R$*@$*              $#$M $@$R $:$1<@$2>         strangeness with @

    # Local names with % are really not local!
    R$+%$+              $@$>30$1@$2                 turn % => @, retry

    # everything else is a local name
    R$+                 $#local $:$1                local names

--------------------- the end of /etc/sendmail.cf ---------------------

All UNIX implementations provide sendmail configuration template files for several of the most common situations; other templates can be found on the network.
Usually the appropriate sendmail.cf file can be put into operation by copying a corresponding template file and performing minimal site-specific customization. Two templates, for main and subsidiary mailhosts, are essential. The main mailhost is supposed to be a knowledgeable mail server dedicated to this task, while the subsidiary mailhost is the prevailing sendmail configuration, one that relies on another (main) mail server.

Customizing a system as a subsidiary mailhost is very easy. Usually it is enough to specify only the hostname of a known main mailhost (main mail server) to which external e-mail is forwarded for further processing and delivery. Sometimes this can even be done outside the sendmail.cf file itself, by assigning the alias name mailhost to the real hostname of the mail server (this could be done in DNS, in NIS, or even in the /etc/hosts file). Of course, it works only if the sendmail.cf file points to the generic entity mailhost.

More sophisticated configuration changes require more knowledge and skills. A manual modification of the sendmail.cf file is always possible, and it is even quite common. However, an alternative approach also exists that generates site-specific sendmail configuration files in an easier and more comprehensive way. It compiles the needed sendmail.cf file from the specified site-specific information. All rulesets and rewrite rules, which make up the dominant part of the file, are created automatically. The site-specific input data are specified in files with the mc extension; again, a number of template mc files are available. What makes this approach different is that these template files are small, comprehensible, and easy to modify if necessary. We will briefly discuss the required procedure.

Template mc files are contained in the sendmail installation subdirectory cf, with the obvious suffix .mc. They must be run through the m4 macro processor to produce a corresponding cf configuration file.
The other requirement is a preloaded description file cf.m4. Once all of the required files are in place, the following command should be executed:

    m4 ${CFDIR}/m4/cf.m4 config.mc > config.cf

where:

    CFDIR        is the root of the cf directory
    config.mc    is the name of the template mc file
    config.cf    is the name of the sendmail configuration file

To make everything even easier, a front-end Build script that specifies all needed compilation steps is also available. Simply by typing:

    Build config.cf

the corresponding sendmail configuration file config.cf will be created based on the config.mc file. The file name is arbitrary, but the existence of a same-name mc file is required. Let us examine a typical mc file:

    cat generic-solaris2.mc
    divert(-1)
    #
    # Copyright (c) 1998 Sendmail, Inc. All rights reserved.
    # Copyright (c) 1983 Eric P. Allman. All rights reserved.
    ...

sendmail uses the M4 macro processor to compile the configuration files. The most important thing to know is that M4 is stream-based, that is, it does not understand lines. For this reason, in some places you may see the word dnl, which stands for "delete through newline"; essentially, it deletes all characters starting at the dnl up to and including the next newline character. In most cases sendmail uses this only to avoid lots of unnecessary blank lines in the output. It can also be used to comment out an mc entry.

Another important directive is define(A, B), which defines the macro A to have the value B. Macros are expanded as they are read, so one normally quotes both values to prevent expansion, for example:

    define(`SMART_HOST', `smart.school.edu')

Please note that M4 macros are expanded even in lines that appear to be comments, for example:

    # See FEATURE(foo) above

This will not do what you expect, because the FEATURE(foo) will be expanded. This also applies to:

    # And then define the $X macro to be the return address

because define is an M4 keyword. If you want to use these words, surround them with single quotes, `like this'.
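The "delete through newline" behavior of dnl described above can be imitated in a few lines of Python. This is only a rough sketch of the visible effect, under simplifying assumptions: it treats dnl as a whole word anywhere in the text, it ignores m4 quoting, and it does not expand any other macro. The function name `strip_dnl` is hypothetical.

```python
import re

# "dnl" counts only as a whole word; it is removed together with
# everything up to and including the next newline.
_DNL = re.compile(r"\bdnl\b[^\n]*\n?")


def strip_dnl(text: str) -> str:
    """Imitate the visible effect of m4's dnl on a piece of input text."""
    return _DNL.sub("", text)


if __name__ == "__main__":
    sample = (
        "define(`SMART_HOST', `smart.school.edu')dnl\n"
        "dnl this entry is commented out\n"
        "define(`A', `B')dnl trailing remark\n"
    )
    # The three input lines collapse into one stream without newlines,
    # which is why mc files end most entries with dnl.
    print(strip_dnl(sample))
```

Because the newline after each dnl is swallowed as well, consecutive entries run together in the output stream; this illustrates why M4 is said not to understand lines.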
The following partial listing of the sendmail cf subdirectory shows a few of the template mc files and the corresponding sendmail configuration cf files (pay attention to their sizes):

    ls -l /opt/sendmail/cf/cf
    total 1548
    -r-xr-xr-x   1 root   other      535 Dec 29  1998 Build
    -r--r--r--   1 root   other     4163 Dec 29  1998 Makefile
    -r--r--r--   1 root   other    29824 Oct 21  1999 cs-solaris2.cf
    -r--r--r--   1 root   other      989 Dec 29  1998 cs-solaris2.mc
    -r--r--r--   1 root   other    28785 Feb  5  1999 generic-hpux10.cf
    -r--r--r--   1 root   other      763 Dec 29  1998 generic-hpux10.mc
    -r--r--r--   1 root   other    28787 Feb  5  1999 generic-solaris2.cf
    -r--r--r--   1 root   other      787 Dec 29  1998 generic-solaris2.mc
    .....
    -r--r--r--   1 root   other    29641 Oct 21  1999 mail.cs.cf
    -r--r--r--   1 root   other     1250 Dec 29  1998 mail.cs.mc

Although the existing tools help in managing the sendmail configuration, a thorough understanding of the contents of the sendmail.cf file is crucial for successful sendmail administration.
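As a recap of how a sendmail.cf file is divided into rulesets (an S entry opens a ruleset, R entries add rewriting rules to it, and any other configuration entry closes it), the following Python sketch groups rewrite rules by ruleset. It is an illustration under stated assumptions, not sendmail's own parser: it assumes that the fields of an R entry are separated by tabs, it skips comments and continuation lines, and the name `split_rulesets` and the sample text are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (left-hand side, right-hand side); the trailing comment field is dropped
Rule = Tuple[str, str]


def split_rulesets(cf_text: str) -> Dict[str, List[Rule]]:
    """Group R entries under the S entry that precedes them."""
    rulesets: Dict[str, List[Rule]] = {}
    current: Optional[str] = None
    for raw in cf_text.splitlines():
        if not raw.strip() or raw.startswith("#"):
            continue  # blank lines and comments are not configuration entries
        if raw[0] in " \t":
            continue  # continuation of the previous entry; ignored in this sketch
        if raw.startswith("S"):
            current = raw[1:].strip()          # ruleset number or name
            rulesets.setdefault(current, [])   # an empty ruleset is allowed
        elif raw.startswith("R") and current is not None:
            fields = [f for f in raw[1:].split("\t") if f]
            lhs = fields[0] if fields else ""
            rhs = fields[1] if len(fields) > 1 else ""
            rulesets[current].append((lhs, rhs))
        else:
            current = None  # any other entry ends the current ruleset
    return rulesets


if __name__ == "__main__":
    sample = (
        "S1\n"
        "# None needed.\n"
        "S3\n"
        "R$+@$+\t\t$:$1<@$2>\t\tfocus on domain\n"
        "R$+<$+@$+>\t$1$2<@$3>\t\tmove gaze right\n"
        "Mether, P=[TCP]\n"
    )
    for name, rules in split_rulesets(sample).items():
        print(name, rules)
```

The empty list returned for ruleset 1 in the sample corresponds to an empty ruleset, which passes addresses through unchanged.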

20.3 The Parsing of E-mail Addresses

Rewrite rules are the core of the sendmail.cf file. Rulesets are groups of associated rewrite rules that can be referenced by a number, or lately any alphanumeric combination. In the S n command