Source Code for Module Martel.Time

"""Parse a time or date string.

Converts a strftime/strptime-like format string into a Martel regular
expression (either as a string or an Expression).

Example use:
  >>> from Martel import Time
  >>> from xml.sax import saxutils
  >>> format = Time.make_expression("%(Jan)-%(day)-%(YYYY)\\n")
  >>> parser = format.make_parser()
  >>> parser.setContentHandler(saxutils.XMLGenerator())
  >>> parser.parseString("OCT-31-2021\\n")
  <?xml version="1.0" encoding="iso-8859-1"?>
  <month type="short">OCT</month>-<day type="numeric">31</day>-<year type="long">2021</year>
  >>>


Times and dates come up often in parsing.  It's usually pretty easy
to write a syntax for them on the fly.  For example, suppose you want
to parse a date in the format
  YYYY-MM-DD

as in "1985-03-26".  One pattern for that is
  "\\d{4}-\\d{2}-\\d{2}"

To get the individual fields in Martel requires group names.
  "(?P<year>\\d{4})-(?P<month>\\d{2})-(?P<day>\\d{2})"

If you want some minimal verification (e.g., to help make sure you
haven't accidentally swapped the day and month fields) you need to
tighten down on what values are allowed, as in
  "(?P<year>\\d{4})-(?P<month>0[1-9]|1[012])-(?P<day>0[1-9]|[12][0-9]|3[01])"
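
As a quick check of the tightened pattern, here is a minimal sketch that
uses Python's standard re module rather than Martel.  The name
parse_date is made up for this illustration, and [0-9] is written out in
place of the digit escape; everything else is the pattern shown above.

```python
import re

# The tightened pattern from the text, with [0-9] in place of the
# digit escape.  The group names match the ones used above.
DATE = re.compile(
    "(?P<year>[0-9]{4})-(?P<month>0[1-9]|1[012])-"
    "(?P<day>0[1-9]|[12][0-9]|3[01])"
)

def parse_date(text):
    # Return the named fields as a dict, or None if the text does not match.
    m = DATE.fullmatch(text)
    return m.groupdict() if m else None

print(parse_date("1985-03-26"))
print(parse_date("1985-26-03"))  # day and month swapped, so no match
```

The swapped date is rejected because "26" is not a valid month, which is
the kind of minimal verification the looser pattern cannot give.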

The more patterns you write by hand, the more likely you are to make a
mistake, the more likely it is that different format definitions use
different patterns, and the harder it is to understand what's going on.

This module helps by providing a set of standard definitions for the
different terms needed in parsing dates, and a way to generate those
definitions from a relatively easy to understand format string.

The syntax of the format string is based on that used by the standard
Unix strftime/strptime functions, with terms taken from the POSIX and
GNU documentation plus some experimentation.  These terms are in the
form "%c" where "c" is a single character.  It's hard to remember
everything through a not always mnemonic single character code, so
Martel.Time adds a new syntax of the form "%(word)" where word can be
one of the single characters, or a multicharacter word.  For example,
"%(Mon)" is identical to "%a" but easier to understand.

The complete list of definitions is given below.

The lowest-level terms (like "year", but excluding terms like "%D"
which expand to other terms) are inside of named groups, which
generate the element tag and attributes when used for Martel.

For example, "%m" generates the pattern used for a month, from "01" to
"12".  The name of the group is "month" and it has a single attribute
named "type" with value "numeric".  (All "numeric" types can be parsed
with Python's 'int' function.)  The default pattern made from "%m" is

    (?P<month?type=numeric>(0[1-9]|1[012]))

and when parsed against a month value, like "05", produces

    <month type="numeric">05</month>

The "type" attribute is used because values which mean the same thing
can be represented in different formats.  The month "January" can be
represented with the word "January" (type = "long"), "Jan" (type =
"short"), "01" (type = "numeric"), "1" (type = "numeric"), or " 1"
(type = "numeric").  [Note: It is possible that subtypes may be added
in the future to distinguish between these different numeric cases.]
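
A small illustration of the remark about 'int': the three numeric
spellings of January all convert to the same number.  The helper name
month_number is made up for this example.

```python
def month_number(text):
    # int() ignores surrounding whitespace and leading zeros, so the
    # zero-padded, bare and space-padded forms all give the same value.
    return int(text)

for text in ("01", "1", " 1"):
    print(repr(text), "->", month_number(text))
```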


FUNCTIONS:

There are two public functions -- "make_pattern" and
"make_expression".  Both take the same parameters and return a regular
expression.

  make_pattern(format, tag_format = "%s") -- returns the expression
       as a pattern string

  make_expression(format, tag_format = "%s") -- returns the expression
       as a Martel.Expression data structure (which can be used to
       make a parser)

The first parameter, "format", is the time format string already
discussed.  Some examples are:

  >>> from Martel import Time
  >>> Time.make_pattern("%y")
  '(?P<year?type=short>\\\\d{2})'
  >>> Time.make_pattern("%H:%M")
  '(?P<hour?type=24-hour>([01][0-9]|2[0-3]))\\\\:(?P<minute?type=numeric>[0-5][0-9])'
  >>>
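
To make the expansion step concrete, here is a minimal standalone sketch
of turning a format string into a pattern.  It is not this module's
code: TERMS is a made-up subset of the real table, expand() is a
hypothetical name, literal text is escaped with re.escape, and plain
Python group names are used because Python's re module does not accept
the "name?type=..." form that Martel understands.

```python
import re

# Hypothetical, much-reduced term table: name -> (pattern, tag).
# A tag of None means the term has no group of its own.
TERMS = {
    "Y": ("[0-9]{4}", "year"),
    "m": ("(0[1-9]|1[012])", "month"),
    "d": ("(0[1-9]|[12][0-9]|3[01])", "day"),
    "F": ("%Y-%m-%d", None),   # expands to other terms
    "YYYY": ("%Y", None),      # word-style alias, assumed for this sketch
}

def expand(fmt):
    out = []
    i = 0
    while i < len(fmt):
        if fmt[i] != "%":
            out.append(re.escape(fmt[i]))  # literal text
            i += 1
            continue
        if i + 1 >= len(fmt):
            out.append(re.escape("%"))     # trailing lone '%': keep it
            break
        if fmt[i + 1] == "(":
            close = fmt.index(")", i)      # ValueError if there is no ')'
            name = fmt[i + 2:close]
            i = close + 1
        else:
            name = fmt[i + 1]
            i += 2
        pat, tag = TERMS[name]
        if "%" in pat:
            pat = expand(pat)              # term made of other terms
        if tag is not None:
            pat = "(?P<%s>%s)" % (tag, pat)
        out.append(pat)
    return "".join(out)

m = re.fullmatch(expand("%F"), "1985-03-26")
print(m.group("year"), m.group("month"), m.group("day"))
```

The recursion is what lets a composite term such as "%F" reuse the
definitions of the terms it is built from.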

The second parameter is used if you want to change the tag name.  For
example, instead of "year" you may want "year-modified" or
"start-year" -- or you may not want a tag at all.

For each term, the tag name ("year", "month", etc.) is %'ed with the
tag_format string.  The default string is "%s" which effectively says
to keep the name unchanged.  Here are a couple of examples which use a
different string.

  >>> from Martel import Time
  >>> Time.make_pattern("%(year)", "%s-modified")
  '(?P<year-modified?type=any>([0-9]{2}([0-9]{2})?))'
  >>> Time.make_pattern("%(year)", "start-%s")
  '(?P<start-year?type=any>([0-9]{2}([0-9]{2})?))'
  >>> Time.make_pattern("%(year)", None)
  '([0-9]{2}([0-9]{2})?)'
  >>>

The tag_format is used for every tag name, which lets you modify
several values at once.  You can even pass in an object which
implements the __mod__ method to make more drastic changes to the
name.

  >>> Time.make_pattern("%H:%M", "%s-created")
  '(?P<hour-created?type=24-hour>([01][0-9]|2[0-3]))\\\\:(?P<minute-created?type=numeric>[0-5][0-9])'
  >>> class Upcase:
  ...     def __mod__(self, name):
  ...         return name.upper()
  ...
  >>> Time.make_pattern("%H:%M", Upcase())
  '(?P<HOUR?type=24-hour>([01][0-9]|2[0-3]))\\\\:(?P<MINUTE?type=numeric>[0-5][0-9])'
  >>>
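
The tag naming rule itself needs nothing from Martel, so it can be tried
on its own.  The sketch below assumes only what is described above (a
false value such as None means "no tag", anything else is combined with
the name using the % operator); use_tag_format is an illustrative
stand-in, not a public function of this module.

```python
def use_tag_format(tag_format, name):
    # A false value (None, "") means no tag should be generated.
    if not tag_format:
        return ""
    # Works for format strings and for any object defining __mod__.
    return tag_format % name

class Upcase:
    # Any object with __mod__ can stand in for a format string.
    def __mod__(self, name):
        return name.upper()

print(use_tag_format("%s-modified", "year"))  # year-modified
print(use_tag_format("start-%s", "year"))     # start-year
print(use_tag_format(Upcase(), "hour"))       # HOUR
print(repr(use_tag_format(None, "year")))     # ''
```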

BUGS:
Only the "C" locale (essentially, US/English) is supported.  Field
widths (as in "%-5d") are not supported.

There is no way to change the element attributes.  I'm not sure this
is a bug.

         ====  Table of Date/Time Specifiers ====

%a is replaced by the pattern for the abbreviated weekday name.
  Pattern: (Mon|Tue|Wed|Thu|Fri|Sat|Sun)
     (the real pattern is case insensitive)
  Example: "Wed" "FRI"
  Element name: "weekday"
  Element attributes: "type" = "short"
  Note: %(Mon) is the same as %a
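
"Case insensitive" here means every letter is expanded into a
two-character class.  The sketch below is a standalone rewrite of that
idea for current Python versions; the name any_case and the use of
string.ascii_letters are choices made for this example.

```python
import re
import string

def any_case(s):
    # Replace each ASCII letter by a class matching both cases,
    # e.g. "M" -> "[Mm]"; every other character is copied unchanged.
    out = []
    for c in s:
        if c in string.ascii_letters:
            out.append("[%s%s]" % (c.upper(), c.lower()))
        else:
            out.append(c)
    return "".join(out)

print(any_case("(Mon|Tue)"))  # ([Mm][Oo][Nn]|[Tt][Uu][Ee])

weekday = any_case("(Mon|Tue|Wed|Thu|Fri|Sat|Sun)")
print(bool(re.fullmatch(weekday, "FRI")))
```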

%A is replaced by the pattern for the full weekday name.
  Pattern: (Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)
     (the real pattern is case insensitive)
  Example: "Thursday" "SUNDAY"
  Element name: "weekday"
  Element attributes: "type" = "long"
  Note: %(Monday) is the same as %A

%b is replaced by the pattern for the abbreviated month name.
  Pattern: (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)
     (the real pattern is case insensitive)
  Example: "Oct" "AUG"
  Element name: "month"
  Element attributes: "type" = "short"
  Note: %(Jan) is the same as %b

%B is replaced by the pattern for the full month name.
  Pattern: (January|February|March|April|May|June|July|August|
            September|October|November|December)
     (the real pattern is case insensitive)
  Example: "August", "MAY"
  Element name: "month"
  Element attributes: "type" = "long"
  Note: %(January) is the same as %B

%c is replaced by the pattern for the US 24-hour date and time
   representation.
  Pattern: same as "%a %b %e %T %Y"
  Example: "Wed Dec 12 19:57:22 2001"
  Element name: "date"
  Element attributes: no attributes
  Note: the individual terms keep their own names and attributes
    inside the "date" element

%C is replaced by the pattern for the century number (the year divided
   by 100 and truncated to an integer) as a decimal number [00-99].
  Pattern: "[0-9][0-9]"
  Example: "19" for the years 1900 to 1999
  Element name: "century"
  Element attributes: "type" = "numeric"

%d is replaced by the pattern for a day of the month as a decimal
   number [01,31].
  Pattern: (0[1-9]|[12][0-9]|3[01])
  Example: "01", "12"
  Element name: "day"
  Element attributes: "type" = "numeric"
  Note: "%d" does not include " 1" or "1".  If you also want to allow
    those then use "%(day)"

%D same as the pattern for "%m/%d/%y".
  Pattern: see "%m/%d/%y".
  Example: "12/13/01"
  Element: only uses names and attributes of the individual terms

%e is replaced by the pattern for a day of the month as a decimal
   number [1,31]; a single digit is preceded by a space.
  Pattern: "( [1-9]|[12][0-9]|3[01])"
  Example: " 1", "31"
  Element name: "day"
  Element attributes: "type" = "numeric"
  Note: "%e" does not include "01" or "1".  If you also want to allow
    those then use "%(day)"

%F same as the pattern for "%Y-%m-%d".
  Pattern: see "%Y-%m-%d".
  Example: "2001-12-21"
  Element: only uses names and attributes of the individual terms

%g is replaced by the pattern for the 2-digit ISO 8601 year (like %G
   but without the century) [00,99].
  Pattern: [0-9][0-9]
  Example: "00"
  Element name: "century"
  Element attributes: "type" = "ISO8601"
  Note: the element name is "century" even though the value is the
    two-digit year.

%G Pattern for the ISO 8601 year with century as a decimal number.
   The 4-digit year corresponding to the ISO week number (see %V).
   This has the same format and value as %Y, except that if the ISO
   week number belongs to the previous or next year, that year is
   used instead. (TZ)
  Pattern: [0-9][0-9][0-9][0-9]
  Example: "1954" "2001"
  Element name: "year"
  Element attributes: "type" = "ISO8601"

%h (DEPRECATED) same as %b.
  Pattern: (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)
  Example: "Feb"
  Element name: "month"
  Element attributes: "type" = "short"
  Note: %(Jan) is the same as %b is the same as %h

%H is replaced by the pattern for the hour on a 24-hour clock, as
   a decimal number [00,23].
  Pattern: ([01][0-9]|2[0-3])
  Example: "00", "01", "23"
  Element name: "hour"
  Element attributes: "type" = "24-hour"
  Note: This does not allow single digit hours like "1".  If you also
    want to include those, use %(24-hour)

%I is replaced by the pattern for the hour on a 12-hour clock, as
   a decimal number [01,12].
  Pattern: (0[0-9]|1[012])
  Example: "01", "12"
  Element name: "hour"
  Element attributes: "type" = "12-hour"
  Note: This does not allow single digit hours like "1".  If you also
    want to include those, use %(12-hour).  The pattern as written
    also accepts "00".

%j is replaced by the pattern for day of the year as a decimal
   number.  First day is numbered "001" [001,366].
  Pattern: "([12][0-9][0-9]|3([012345][0-9]|6[0-6])|0(0[1-9]|[1-9][0-9]))"
  Example: "001", "092", "362"
  Element name: "year_day"
  Element attributes: "type" = "1"
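
The day-of-year pattern is the hardest one in this table to check by
eye.  As a sanity check outside of Martel, the sketch below runs the
pattern over every zero-padded three-digit string using Python's re
module; the variable names are made up for the example.

```python
import re

YEAR_DAY = "([12][0-9][0-9]|3([012345][0-9]|6[0-6])|0(0[1-9]|[1-9][0-9]))"

# Collect every value from "000" to "999" that the pattern accepts.
accepted = [n for n in range(1000) if re.fullmatch(YEAR_DAY, "%03d" % n)]

print(accepted[0], accepted[-1], len(accepted))
```

The accepted values are exactly 1 through 366, so "000" and "367" are
both rejected.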

%k is replaced by the pattern for the hour on a 24-hour clock, as a
   decimal number (range 0 to 23); single digits are preceded by a
   blank.
  Pattern: "( [0-9]|1[0-9]|2[0123])"
  Example: " 1", "10", "23"
  Element name: "hour"
  Element attributes: "type" = "24-hour"
  Note: This does not allow single digit hours like "1" or hours which
    start with a "0" like "03".  If you also want to include those,
    use %(24-hour).  See also %H.

%l is replaced by the pattern for the hour on a 12-hour clock, as a
   decimal number (range 1 to 12); single digits are preceded by a
   blank.
  Pattern: "( [0-9]|1[012])"
  Example: " 1", "10", "12"
  Element name: "hour"
  Element attributes: "type" = "12-hour"
  Note: This does not allow single digit hours like "1" or hours which
    start with a "0" like "03".  If you also want to include those,
    use %(12-hour).  See also %I.  The pattern as written also
    accepts " 0".

%m is replaced by the pattern for the month as a decimal number [01,12].
  Pattern: "(0[1-9]|1[012])"
  Example: "01", "09", "12"
  Element name: "month"
  Element attributes: "type" = "numeric"
  Note: This does not allow single digit months like "1" or months which
    start with a space like " 3".  If you also want to include those,
    use %(month).  See also %(MM), which is an alias for %m.

%M is replaced by the pattern for the minute as a decimal number [00,59].
  Pattern: "[0-5][0-9]"
  Example: "00", "38"
  Element name: "minute"
  Element attributes: "type" = "numeric"
  Note: this is the same as %(minute)

%n is replaced by the pattern for the newline character.
  Pattern: "\\n"
  Note: you shouldn't need to use this

%p is replaced by the case insensitive pattern for "AM" or "PM"
  Pattern: "([AaPp][Mm])"
  Example: "AM", "pm"
  Element name: "ampm"
  Element attributes: no attributes
  Note: this doesn't allow "a.m." or "P.M."

%P is identical to "%p" (they have slightly different meanings for output)
  Pattern: "([AaPp][Mm])"
  Example: "am", "PM"
  Element name: "ampm"
  Element attributes: no attributes
  Note: this doesn't allow "a.m." or "P.M."

%r is equivalent to "%I:%M:%S %p".
  Pattern: see the patterns for the individual terms
  Example: "07:57:22 PM"
  Element: only uses names and attributes of the individual terms

%R is the pattern for the 24 hour notation "%H:%M".
  Pattern: see the patterns for the individual terms
  Example: "19:57"
  Element: only uses names and attributes of the individual terms

%s is replaced by the pattern for a decimal number of seconds (Unix
   timestamp).
  Pattern: "[0-9]+"
  Example: "1008205042"
  Element name: "timestamp"
  Element attributes: no attributes

%S is replaced by the pattern for the second as a decimal number.
   Can take values from "00" to "61" (includes double leap seconds).
  Pattern: "([0-5][0-9]|6[01])"
  Example: "03", "25"
  Element name: "second"
  Element attributes: "type" = "numeric"
  Note: This is the same as %(second)

%t is replaced by the pattern for a tab character. (plat-spec)
  Pattern: "\\t"
  Note: You shouldn't need to use this.

%T is identical to the 24-hour time format "%H:%M:%S".
  Pattern: see the patterns for the individual terms
  Example: "19:57:22"
  Element: only uses names and attributes of the individual terms

%u is replaced by the pattern for the weekday as a decimal number
   [1,7], with "1" representing Monday.
  Pattern: "[1-7]"
  Example: "4" (which is Thursday)
  Element name: "weekday"
  Element attributes: "type" = "Monday1"
  Note: See also %w, which has a type of "Sunday0"

%U is replaced by the pattern for the week number of the year (Sunday
   as the first day of the week) as a decimal number [00,53].  In
   other words, this is the number of Sundays seen so far in the year.
  Pattern: "([0-4][0-9]|5[0-3])"
  Example: "04", "26"
  Element name: "week_number"
  Element attributes: "type" = "Sunday_count"
  Note: See also %V and %W
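
The "number of Sundays seen so far" reading of %U can be checked against
Python's own strftime, which implements the same specifier for output.
This is only a sketch: sundays_so_far is a made-up helper, and it assumes
the platform's strftime follows the usual C definition of %U.

```python
from datetime import date, timedelta

def sundays_so_far(d):
    # Count the Sundays from 1 January up to and including d.
    start = date(d.year, 1, 1)
    days = (d - start).days + 1
    return sum(1 for i in range(days)
               if (start + timedelta(days=i)).weekday() == 6)  # 6 is Sunday

d = date(2001, 12, 12)
print(d.strftime("%U"), sundays_so_far(d))
```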

%V is replaced by the pattern for the week number of the year (Monday
   as the first day of the week) as a decimal number [01,53]. This is
   the ISO 8601 week number: if the week containing 1 January has four
   or more days in the new year, then it is considered week 1. (Otherwise,
   it is the last week of the previous year, and the next week is week 1.)
  Pattern: "(0[1-9]|[1-4][0-9]|5[0-3])"
  Example: "04", "33"
  Element name: "week_number"
  Element attributes: "type" = "type_V" (Got a better short name?)
  Note: See also %U and %W.  I don't know when to use this.
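
The four-day rule is easiest to see with dates around New Year.  The
sketch below uses date.isocalendar() from Python's standard library,
which follows the same ISO 8601 rule; iso_year_week is a made-up helper
name.  The first value it returns is the year that %G describes and the
second is the week number that %V describes.

```python
from datetime import date

def iso_year_week(d):
    # isocalendar() gives (ISO year, ISO week number, ISO weekday).
    iso_year, iso_week, _weekday = d.isocalendar()
    return iso_year, iso_week

# 1 January 2020 was a Wednesday: five days of that week fall in 2020,
# so it is week 1 of 2020.
print(iso_year_week(date(2020, 1, 1)))
# 1 January 2021 was a Friday: only three days of that week fall in
# 2021, so it still belongs to week 53 of 2020.
print(iso_year_week(date(2021, 1, 1)))
```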

%w is replaced by the pattern for the weekday as a decimal number [0,6],
   with 0 representing Sunday.
  Pattern: "[0-6]"
  Example: "6"
  Element name: "weekday"
  Element attributes: "type" = "Sunday0"
  Note: See also %u, which has a type of "Monday1"

%W is replaced by the pattern for the week number of the year (Monday
   as the first day of the week) as a decimal number [00,53]. All days
   in a new year preceding the first Monday are considered to be in
   week 0.  In other words, this is the number of Mondays seen so far
   in the year.
  Pattern: "([0-4][0-9]|5[0-3])"
  Example: "00", "49"
  Element name: "week_number"
  Element attributes: "type" = "Monday_count"
  Note: See also %U and %V.

%x is the same as "%D", which is "%m/%d/%y".
  Pattern: see the patterns for the individual terms
  Example: "12/13/99"
  Element: only uses names and attributes of the individual terms

%X is the same as "%T", which is "%H:%M:%S".
  Pattern: see the patterns for the individual terms
  Example: "19:57:22"
  Element: only uses names and attributes of the individual terms

%y is replaced by the pattern for the year without century, as a
   decimal number [00,99].
  Pattern: "[0-9][0-9]"
  Example: "89", "01"
  Element name: "year"
  Element attributes: "type" = "short"
  Note: This is the same as %(YY).

%Y is replaced by the pattern for the year, including the century, as a
   decimal number.
  Pattern: "[0-9][0-9][0-9][0-9]"
  Example: "1610", "2002"
  Element name: "year"
  Element attributes: "type" = "long"
  Note: This is the same as %(YYYY).

%z is replaced by the pattern for the time-zone as an hour and minute
   offset from GMT.
   (This is used when parsing RFC822-conformant dates, as in
     "%a, %d %b %Y %H:%M:%S %z", except that %z does not include the
    pattern for a missing timezone -- should I fix that?).
  Pattern: "[-+][0-9][0-9][0-9][0-9]"
  Example: "-0500"  (for EST), "+0100" (for CET), "+0530" (somewhere in India)
  Element name: "timezone"
  Element attributes: "type" = "RFC822"
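
Once a "timezone" element of type "RFC822" has been parsed, its text
still has to be turned into a number.  One way to do that is sketched
below; offset_minutes is a made-up helper and is not part of this
module.

```python
def offset_minutes(text):
    # text is a "%z"-style field: a sign followed by HHMM.
    sign = -1 if text[0] == "-" else 1
    hours = int(text[1:3])
    minutes = int(text[3:5])
    return sign * (hours * 60 + minutes)

for zone in ("-0500", "+0100", "+0530"):
    print(zone, offset_minutes(zone))
```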

%Z is replaced by a pattern for a timezone name or abbreviation.  (It does
    not allow a missing timezone field.)
  Pattern: "(GMT([+-][0-9][0-9][0-9][0-9])?|[A-Z][a-zA-Z]*( [A-Z][a-zA-Z]*)*)"
              (is there anything better?)
  Example: "MST", "GMT", "Pacific Standard Time", "GRNLNDST", "MET DST",
           "New Zealand Standard Time", "NZST", "SAST", "GMT+0200", "IDT"
  Element name: "timezone"
  Element attributes: "type" = "name"

%% is replaced by the pattern for "%" (which happens to be "%")
  Pattern: "%"
  Example: "%"
  Element: none

 === Martel specific extensions ===

%(Mon) is the same as "%a".
  Pattern: See the definition for "%a"
  Example: "Wed" "FRI"
  Element name: "weekday"
  Element attributes: "type" = "short"

%(Monday) is the same as "%A".
  Pattern: See the definition for "%A"
  Example: "Thursday" "SUNDAY"
  Element name: "weekday"
  Element attributes: "type" = "long"

%(Jan) is the same as "%b".
  Pattern: See the definition for "%b"
  Example: "Feb"
  Element name: "month"
  Element attributes: "type" = "short"

%(January) is the same as "%B".
  Pattern: See the definition for "%B"
  Example: "August", "MAY"
  Element name: "month"
  Element attributes: "type" = "long"

%(second) is the same as "%S".
  Pattern: See the definition for "%S".
  Example: "03", "25"
  Element name: "second"
  Element attributes: "type" = "numeric"

%(minute) is the same as "%M".
  Pattern: See the definition for "%M"
  Example: "00", "38"
  Element name: "minute"
  Element attributes: "type" = "numeric"

%(12-hour) is replaced by the pattern for a 12 hour clock in any of
   the common formats.  (Numeric values from 1 to 12.)
  Pattern: "(0[1-9]|1[012]?|[2-9]| [1-9])"
  Example: "2", "02", " 2", "10"
  Element name: "hour"
  Element attributes: "type" = "12-hour"
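
To see which spellings the "any common format" pattern really accepts,
it can be run over a few candidates with Python's re module.  This is a
sketch outside of Martel; the helper name accepted and the list of
candidates are made up for the example.

```python
import re

HOUR_12 = "(0[1-9]|1[012]?|[2-9]| [1-9])"

def accepted(pattern, candidates):
    # Keep only the candidates that the pattern matches completely.
    return [c for c in candidates if re.fullmatch(pattern, c)]

candidates = ["0", "00", " 0", "1", "01", " 1", "9", "10", "12", "13"]
print(accepted(HOUR_12, candidates))
```

Bare, zero-padded and space-padded hours from 1 to 12 get through, while
any spelling of hour 0 and anything above 12 is rejected.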

%(24-hour) is replaced by the pattern for a 24 hour clock in any
   of the common formats.  (Numeric values from 0 to 23.)
  Pattern: "([01][0-9]?|2[0123]?|[3-9]| [0-9])"
  Example: "9", "09", " 9", "00", "0", " 0", "23"
  Element name: "hour"
  Element attributes: "type" = "24-hour"

%(hour) is replaced by the pattern for any hour in either a
   12-hour or 24-hour clock.
  Pattern: "([01][0-9]?|2[0123]?|[3-9]| [0-9])"
     (this happens to be the same as %(24-hour))
  Example: "9", "09", " 9", "00", "0", " 0", "23"
  Element name: "hour"
  Element attributes: "type" = "any"

%(day) is replaced by the pattern for the day of the month as a decimal
   in any of the common day formats.
  Pattern: "(0[1-9]|[12][0-9]?|3[01]?|[4-9]| [1-9])"
  Example: "9", "09", " 9", and "31"
  Element name: "day"
  Element attributes: "type" = "numeric"

%(DD) is the same as "%d", which is the pattern for a day of the month
   as a decimal number [01,31].
  Pattern: See the definition for "%d"
  Example: "09", "31"
  Element name: "day"
  Element attributes: "type" = "numeric"

%(month) is replaced by the pattern for the month as a decimal in any
   of the common month formats.
  Pattern: "(0[1-9]|1[012]?|[2-9]| [1-9])"
  Example: "5", "05", " 5", and "12".
  Element name: "month"
  Element attributes: "type" = "numeric"
  Note: See also "%m" and %(MM).

%(MM) is the same as "%m", which is a two-digit month number [01,12]
  Pattern: See the definition for "%m"
  Example: "05", "01", and "12".
  Element name: "month"
  Element attributes: "type" = "numeric"
  Note: See also %(month).

%(YY) is replaced by the pattern for a two-digit year; it matches the
   same text as "%y".
  Pattern: "[0-9][0-9]"
  Example: "10"
  Element name: "year"
  Element attributes: "type" = "short"

%(YYYY) is replaced by the pattern for a four-digit year; it matches
   the same text as "%Y".
  Pattern: "[0-9][0-9][0-9][0-9]"
  Example: "1970"
  Element name: "year"
  Element attributes: "type" = "long"

%(year) is replaced by the pattern accepting 2 digit and 4 digit year formats.
  Pattern: "([0-9]{2}([0-9]{2})?)"
  Example: "2008", "97"
  Element name: "year"
  Element attributes: "type" = "any"
  Note: Need to change this before the year 10,000
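
A short check of what "2 digit and 4 digit" means in practice, again
using Python's re module rather than Martel; is_year is a made-up
helper name.

```python
import re

YEAR = "([0-9]{2}([0-9]{2})?)"

def is_year(text):
    # True only when the whole text is two digits or four digits.
    return re.fullmatch(YEAR, text) is not None

for text in ("2008", "97", "197", "10000"):
    print(text, is_year(text))
```

Three-digit and five-digit years are rejected, which is the limitation
the note above refers to.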

"""
import string, Martel, Expression

# Letters are case independent.
def _any_case(s):
    t = ""
    for c in s:
        if c in string.letters:
            t = t + "[%s%s]" % (string.upper(c), string.lower(c))
        else:
            t = t + c
    return t

_time_fields = (
    ("a", _any_case("(Mon|Tue|Wed|Thu|Fri|Sat|Sun)"),
     "weekday", {"type": "short"}),
    ("A", _any_case("(Monday|Tuesday|Wednesday|Thursday|Friday|"
                    "Saturday|Sunday)"),
     "weekday", {"type": "long"}),
    ("b", _any_case("(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"),
     "month", {"type": "short"}),
    ("B", _any_case("(January|February|March|April|May|June|July|August|"
                    "September|October|November|December)"),
     "month", {"type": "long"}),
    ("C", r"\d\d",
     "century", {"type": "numeric"}),
    ("d", "(0[1-9]|[12][0-9]|3[01])",   # day of month, "01".."31"
     "day", {"type": "numeric"}),
    ("e", "( [1-9]|[12][0-9]|3[01])",   # day of month, " 1".."31"
     "day", {"type": "numeric"}),
    ("g", r"\d{2}",   # ISO 8601 century, 2001 is "01"
     "century", {"type": "ISO8601"}),
    ("G", r"\d{4}",   # ISO 8601 year, 2001
     "year", {"type": "ISO8601"}),
    ("H", "([01][0-9]|2[0-3])",   # hours, "00".."23"
     "hour", {"type": "24-hour"}),
    ("I", "(0[0-9]|1[012])",   # hours, "01".."12"
     "hour", {"type": "12-hour"}),

    # day of year, "001".."366"
    ("j", "([12][0-9][0-9]|3([012345][0-9]|6[0-6])|0(0[1-9]|[1-9][0-9]))",
     "year_day", {"type": "1"}),

    ("k", "( [0-9]|1[0-9]|2[0123])",   # hour, " 0".."23"
     "hour", {"type": "24-hour"}),
    ("l", "( [0-9]|1[012])",   # hour, " 1", " 2", .. "12"
     "hour", {"type": "12-hour"}),
    ("m", "(0[1-9]|1[012])",   # month, "01", "02", .. "12"
     "month", {"type": "numeric"}),
    ("M", "[0-5][0-9]",   # minute, "00", .. "59"
     "minute", {"type": "numeric"}),
    ("n", r"\n", None, None),
    ("p", "([AaPp][Mm])",   # AM
     "ampm", {}),
    ("P", "[aApP][mM]",   # am
     "ampm", {}),
    ("s", r"\d+",   # seconds in unix epoch
     "timestamp", {}),
    ("S", "([0-5][0-9]|6[01])",   # second, [00,61]
     "second", {"type": "numeric"}),
    ("t", r"\t", None, None),
    ("u", "[1-7]",   # weekday, [1,7], 1=Monday
     "weekday", {"type": "Monday1"}),
    ("U", "([0-4][0-9]|5[0-3])",   # week number, [00,53], Sunday first day
     "week_number", {"type": "Sunday_count"}),
    ("V", "(0[1-9]|[1-4][0-9]|5[0-3])",   # week number, [01,53], Monday first day, split
     "week_number", {"type": "type_V"}),   # when is this used?
    ("w", "[0-6]",   # weekday, [0,6], 0=Sunday
     "weekday", {"type": "Sunday0"}),
    ("W", "([0-4][0-9]|5[0-3])",   # week number, [00,53], Monday first day, all
     "week_number", {"type": "Monday_count"}),
    ("y", r"\d{2}",   # 2001 is "01"
     "year", {"type": "short"}),
    ("Y", r"\d{4}",   # 2000
     "year", {"type": "long"}),
    ("z", r"[-+]\d{4}",
     "timezone", {"type": "RFC822"}),
    # "MST", "GMT", "Pacific Standard Time", "GRNLNDST", "MET DST",
    # "New Zealand Standard Time", "NZST", "SAST", "GMT+0200", "IDT"
    ("Z", r"(GMT([+-]\d{4})?|[A-Z][a-zA-Z]*( [A-Z][a-zA-Z]*)*)",
     "timezone", {"type": "name"}),
    ("%", "%", None, None),

    # These are dependent on other fields
    ("D", "%m/%d/%y", None, None),      # 06/29/01
    ("F", "%Y-%m-%d", None, None),      # 2001-06-29
    ("h", "%b", None, None),            # Jan, Feb, ... Dec
    ("r", "%I:%M:%S %p", None, None),   # "05:07:50 AM"
    ("R", "%H:%M", None, None),         # "05:07", "19:57"
    ("T", "%H:%M:%S", None, None),      # "01:00:49"
    ("x", "%D", None, None),            # no locale, 09/23/01
    ("X", "%T", None, None),            # no locale, 00:59:18
    ("c", "%a %b %e %T %Y", "date", {}),   # "Wed Dec  2 19:57:22 2001"

    # I made these up
    ("Mon", "%a", None, None),
    ("Monday", "%A", None, None),
    ("Jan", "%b", None, None),
    ("January", "%B", None, None),
    ("second", "%S", None, None),
    ("minute", "%M", None, None),
    ("12-hour", r"(0[1-9]|1[012]?|[2-9]| [1-9])",
     "hour", {"type": "12-hour"}),
    ("24-hour", r"([01][0-9]?|2[0123]?|[3-9]| [0-9])",
     "hour", {"type": "24-hour"}),
    ("hour", r"([01][0-9]?|2[0123]?|[3-9]| [0-9])",
     "hour", {"type": "any"}),
    ("day", r"(0[1-9]|[12][0-9]?|3[01]?|[4-9]| [1-9])",
     "day", {"type": "numeric"}),
    ("DD", "%d", None, None),
    ("month", r"(0[1-9]|1[012]?|[2-9]| [1-9])", "month", {"type": "numeric"}),
    ("MM", "%m", None, None),
    ("YY", r"[0-9]{2}", "year", {"type": "short"}),
    ("YYYY", r"[0-9]{4}", "year", {"type": "long"}),
    ("year", r"([0-9]{2}([0-9]{2})?)", "year", {"type": "any"}),
    )
_time_table = {}
for spec, pat, tag, attrs in _time_fields:
    _time_table[spec] = (pat, tag, attrs)
for v in _time_table.values():
    v = v[0]
    assert (v[0] == '(' and v[-1] == ')') or '|' not in v, v

def make_pattern(format, tag_format = "%s"):
    """format, tag_format = "%s" -> regular expression pattern string

    Turn the given time format string into the corresponding regular
    expression string.  A format term may contain a Group name and attribute
    information.  If present, the group name is %'ed with the
    tag_format to produce the tag name to use.  Use None to specify
    that named groups should not be used.

    >>> from Martel import Time
    >>> print Time.make_pattern("%m-%Y)", "created-%s")
    (?P<created-month?type=numeric>(0[1-9]|1[012]))\\-(?P<created-year?type=long>\\d{4})\\)
    >>>

    See the Time module docstring for more information.

    """
    return _parse_time(format, tag_format,
                       text_to_result = Expression.escape,
                       group_to_result = Expression._make_group_pattern,
                       re_to_result = lambda x: x,
                       t = "")

def make_expression(format, tag_format = "%s"):
    """format, tag_format = "%s" -> Martel Expression

    Turn the given time format string into the corresponding Martel
    Expression.  A format term may contain a Group name and attribute
    information.  If present, the group name is %'ed with the
    tag_format to produce the tag name to use.  Use None to specify
    that named groups should not be used.

    >>> from Martel import Time
    >>> from xml.sax import saxutils
    >>> exp = Time.make_expression("%m-%Y\\n", "created-%s")
    >>> parser = exp.make_parser()
    >>> parser.setContentHandler(saxutils.XMLGenerator())
    >>> parser.parseString("05-1921\\n")
    <?xml version="1.0" encoding="iso-8859-1"?>
    <created-month type="numeric">05</created-month>-<created-year type="long">1921</created-year>
    >>>

    See the Time module docstring for more information.

    """
    return _parse_time(format, tag_format,
                       text_to_result = Martel.Str,
                       group_to_result = Martel.Group,
                       re_to_result = Martel.Re,
                       t = Martel.NullOp())

def _use_tag_format(tag_format, name):
    if not tag_format:
        return ""
    return tag_format % name

# text_to_result converts an exact string to the result expression type
#
# group_to_result converts (name, subexpression, attrs) into the
#    result expression type (the subexpression is in the correct type)
#
# re_to_result converts a regular expression pattern string into
#    the result expression type
#
# Partial results are "+"ed to 't'.
#
def _parse_time(s, tag_format, text_to_result, group_to_result,
                re_to_result, t):
    initial_t = t
    n = len(s)
    end = 0
    while end < n:
        prev = end
        # Is there another '%' term?
        start = string.find(s, "%", end)
        if start == -1:
            break
        end = start + 1

        if end == n:
            end = prev
            break  # ended with '%' as last character; keep it

        c = s[end]
        # Is this a %() escape?
        if c == '(':
            pos = string.find(s, ")", end)
            if pos != -1:
                # We have a %(special) construct
                c = s[end+1:pos]
            else:
                raise TypeError("Found a '%%(' but no matching ')': %s" % \
                                (repr(s[end-1:]),))
            end = pos

        # Get the expansion and, if needed, do the recursion
        pat, name, attrs = _time_table[c]
        if "%" in pat and pat != "%":
            pat = _parse_time(pat, tag_format,
                              text_to_result, group_to_result,
                              re_to_result, initial_t)
        else:
            pat = re_to_result(pat)
        if name is not None:
            fullname = _use_tag_format(tag_format, name)
            if fullname:
                pat = group_to_result(fullname, pat, attrs)

        if prev + 1 > start:
            t = t + pat
        else:
            t = t + text_to_result(s[prev:start]) + pat
        end = end + 1

    if end < n:
        t = t + text_to_result(s[end:])
    return t