Reports folder

From Cumulus Wiki

This page applies to both flavours (Cumulus 1 and Cumulus MX). It has been updated to cover all releases up to 3.12.0.

The Reports folder

All flavours of Cumulus software include a sub-folder called Reports in their release distribution. Please note that its name starts with a capital letter, whereas other folders (all of them for Cumulus 1 and 2, most of them for MX) have entirely lower-case names.

When you first install Cumulus this folder is empty (for technical readers, MX releases do have one hidden file in the folder, .gitignore). It is created just in case you decide to use the report functionality.


Optional functionality

The generation of files to be stored in this folder is optional functionality:

Cumulus MX (MX settings)

Switch this functionality on by selecting NOAA Settings in the Settings menu.

[Image: NOAA settings.png. In the image, "month specification" is abbreviated to "MS".]

Cumulus 1.9.x (Configuration menu)

Switch this functionality on by selecting NOAA setup from the Configuration menu. When you make that selection, you will be presented with settings similar to those shown for MX, although the layout is different!

[Image: Cumulus Configuration Menu.png]

NOAA Report Settings

The various settings are listed here for MX and here for the legacy software. You will get the maximum understanding from reading both those links.

Basically, some settings determine text to appear in the reports, and others specify thresholds used for various calculations. There is a mixture of settings that have a default, and settings that are initially blank.

There are defaults for the file names. The next section explains that, although you can change these, such changes will stop you using any scripts provided for viewing these reports on your web site. So think about your technical skills, and whether you can write your own scripts or modify supplied scripts.

City and State are American terminology. The actual content of the first might be village, town, district, or city. The actual content of the second might be a region, area name, or county.

Looking at the other initially blank settings, most of these are for monthly figures. You decide where to get the figures from: your local meteorological service might publish climate figures based on 10-year or 30-year averages as required by the World Meteorological Organisation (WMO), your tourist authority may have some figures for your area, or there might be another weather station not far away whose owner you can ask.


NOAA style Report Naming

The files that hold the report content have to indicate which month, or year, they cover. This means the file names consist of a fixed prefix literal, then a code representing a date, and finally the literal ".txt" to confirm to the operating system that the report is a text file.

Microsoft Operating Systems use file extensions (that by default are hidden) to indicate file types; other operating systems are less fussy.

Cumulus (both 1 and MX) allows the prefix, and the date-specifying part, of the file name to be customised. However, if you use the MX Default Web Site or third-party User Contributions, then these assume you are using the default configuration settings for the release of Cumulus when they were written. For example, this means the date specifier must be purely numerical (to avoid coping with language variants).

The table later will explain all the possibilities for the date specifier. For now, it is important to stress that all parts of the file name are parsed by the date interpreter, and that is why the prefix and suffix to the date specifier must be quoted as literals.

  • For MX, this means all parts of the file name for the report must be understandable when processed by a C# date format parse.
  • The legacy Cumulus uses Delphi to interpret the file name.


For simplicity, the codes recommended by this page will work for both the C# date format parser, and the Delphi interpreter:

  1. The capital letter "M" is the basis for identifying month (e.g. MM is a 2 digit month number)
  2. The lower case "y" is repeated to indicate the number of digits to be used to represent the year
  3. Any fixed text is enclosed by quotes to be treated as a literal.
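As a rough illustration of the three rules above, the following Python sketch expands a file name setting into a report name. It is not the C# or Delphi interpreter that Cumulus actually uses; the function name report_name and the hard-coded English month names are assumptions made for this example, and only the specifiers recommended on this page are handled.

```python
import re
from datetime import date

# Assumption: English month names. Cumulus itself takes these from the locale.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Quoted literals first, then the longest specifiers, then any other character.
TOKEN = re.compile(r"""'[^']*'|"[^"]*"|yyyy|yy|MMMM|MMM|MM|.""")


def report_name(setting: str, when: date) -> str:
    """Expand a setting such as "NOAAMO"MMyyyy".txt" for the given date."""
    parts = []
    for tok in TOKEN.findall(setting):
        if len(tok) >= 2 and tok[0] in "'\"" and tok[-1] == tok[0]:
            parts.append(tok[1:-1])             # quoted literal: copy the inside
        elif tok == "yyyy":
            parts.append(f"{when.year:04d}")    # 4 digit year
        elif tok == "yy":
            parts.append(f"{when.year % 100:02d}")  # 2 digit year
        elif tok == "MMMM":
            parts.append(MONTHS[when.month - 1])    # full month name
        elif tok == "MMM":
            parts.append(MONTHS[when.month - 1][:3])  # short month name
        elif tok == "MM":
            parts.append(f"{when.month:02d}")   # 2 digit month number
        else:
            parts.append(tok)                   # anything else passes through
    return "".join(parts)


print(report_name('"NOAAMO"MMyyyy".txt"', date(2010, 3, 1)))
print(report_name('"NOAAYR"yyyy".txt"', date(2010, 3, 1)))
```

With a date in March 2010 this prints NOAAMO032010.txt and NOAAYR2010.txt. The sketch passes unquoted characters through unchanged, which the real interpreters do not do; that difference is exactly why the prefix and suffix must be quoted in the actual settings.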

Viewing reports locally

The local report viewer is able to display these reports even if you customise the file name, because it can find out how you have configured the file name.

So if you are only viewing the reports locally, you can customise the file names.

Viewing reports on your web site

The configuration file contains information like passwords, so it must not be accessible to your web site. The latest file names for your monthly and annual reports are available as web tags, so they can be included in a Cumulus template file. Cumulus processes that template, and in that way the information can be passed to your web site.

As stated earlier, the default web site provided with MX, and third party report viewers for your web site, assume that the files containing the report text are named by the default names. Third party report viewers will also make assumptions about the encoding used, based on the Cumulus default encoding at the time the third party viewer was written, that may not be the default MX now uses, or the default in Cumulus 1.

If you customise any part of the file name, you must write your own script to be able to interpret the file name to find reports for displaying on your web server. If you want to see these reports on your web server and are not able to write your own scripts, don't modify anything, and skip all instructions about possible alternatives.
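To show why a viewer script depends on the default names, here is a minimal Python sketch of the kind of matching such a script might do. The function name classify and the return format are invented for this illustration; no supplied or third-party viewer is claimed to work this way.

```python
import re

# Patterns for the default names only: NOAAMO + MMyyyy and NOAAYR + yyyy.
MONTHLY = re.compile(r"^NOAAMO(\d{2})(\d{4})\.txt$")
YEARLY = re.compile(r"^NOAAYR(\d{4})\.txt$")


def classify(file_name: str):
    """Return (kind, year, month) for a default-named report, else None."""
    m = MONTHLY.match(file_name)
    if m:
        month, year = int(m.group(1)), int(m.group(2))
        # Reject impossible month numbers such as 13 or 00.
        return ("monthly", year, month) if 1 <= month <= 12 else None
    y = YEARLY.match(file_name)
    if y:
        return ("yearly", int(y.group(1)), None)
    return None


print(classify("NOAAMO032010.txt"))   # default monthly name
print(classify("NOAAMO2010-03.txt"))  # customised name is not recognised
```

The first call prints ('monthly', 2010, 3) and the second prints None. A script written like this simply cannot see a report whose name has been customised.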

The prefix literal

As stated earlier, although the prefix can be customised as explained below, you must leave these at the default if you are using web pages provided by MX or by third-parties to display these reports on your web site.

Cumulus MX accepts single or double quotes to define a literal for these report names.

The yearly reports have a default prefix literal of "NOAAYR". If you are not using standard scripts on your web site, you might choose to use 'Blwyddyn' (Welsh language), "Année" (French language), or the equivalent in your language, to make the file name more meaningful for you.

The monthly reports have a default prefix literal of "NOAAMO". As before, if you are able to write your own script, you might prefer to change this default literal into your own language.

The date modifier between the literals

The defaults selected by Steve Loft are MMyyyy for monthly reports and yyyy for yearly reports (expressed in a way that suits both Cumulus 1 and MX), so the inserted part is all numerical. As already emphasised, you must keep these defaults if you want to use the MX default web pages, or a third-party supplied report viewing script for your web site. The report viewer in both the legacy Cumulus and the current MX release will accept all options shown here.

Here is a table showing the main alternative options for date modifier, and how they look with the fixed literal prefix and the text file type literal suffix as required for the box where you enter the file name within Cumulus settings.

Each entry below gives the Delphi specifier for Cumulus 1.9.x, the Mono specifier for Cumulus MX, an explanation, what to type into the settings box, and an example of the produced name.

Yearly report

  • Delphi specifier for Cumulus 1.9.x: YYYY or yyyy
    • Mono specifier for Cumulus MX: yyyy
    • Explanation: This is the default mentioned above, and must be used for standard web page viewers. Only Cumulus 1 is case insensitive, so use lower case for full compatibility.
    • What to type into settings box: "NOAAYR"yyyy".txt"
    • Example of produced name: NOAAYR2010.txt
  • Delphi specifier for Cumulus 1.9.x: YY or yy
    • Mono specifier for Cumulus MX: yy
    • Explanation: This represents a 2 digit year number alternative format, if you really feel the need to be different. People using old Microsoft Windows Operating Systems selected this because they needed to keep file names short, at only 8 characters.
    • What to type into settings box: "NOAAYR"yy".txt"
    • Example of produced name: NOAAYR10.txt

Monthly report

  • Delphi specifier for Cumulus 1.9.x: mmyyyy (or MMYYYY)
    • Mono specifier for Cumulus MX: MMyyyy
    • Explanation: This is the standard date specifier, and must be used for any standard web pages for viewing these reports. Note the difference between the case used by MX and the default used by Cumulus 1; that caused some problems for those migrating from Cumulus 1 to early MX releases who kept their cumulus.ini file. From release 3.3.0 onwards, if you migrate from Cumulus 1 (where case does not matter) to Cumulus MX (where case does matter), MX will rewrite Cumulus.ini if it reads "NOAAMO'mmyy'.txt" (MX believes "mm" means minutes, not month). It is automatically changed into "NOAAMO'MMyy'.txt" (which works on both Cumulus 1 and MX).
    • What to type into settings box: "NOAAMO"MMyyyy".txt"
    • Example of produced name: NOAAMO032010.txt
  • Delphi specifier for Cumulus 1.9.x: mmyy (or MMyy or mmYY or MMYY)
    • Mono specifier for Cumulus MX: MMyy
    • Explanation: This is an alternative numerical representation with a 2 digit year number, if you really feel the need to be different. People using old Microsoft Windows Operating Systems selected this because they needed to keep file names short; in fact they changed the prefix too, so the file name was only 8 characters. Note if you are using Cumulus 1: use the legacy software to change the Cumulus 1 specifier to match the MX one, certainly before you migrate.
    • What to type into settings box: "NOAAMO"MMyy".txt"
    • Example of produced name: NOAAMO0310.txt
  • Delphi specifier for Cumulus 1.9.x: yyyy-mm (or YYYY-MM)
    • Mono specifier for Cumulus MX: yyyy-MM
    • Explanation: This is an alternative numerical representation based on part of the ISO 8601 format. This naming format is popular as it results in files being in chronological sequence when listed by file name. However, remember that this will not be recognised by any third-party web page script, nor by the MX default web page script.
    • What to type into settings box: "NOAAMO"yyyy-MM".txt"
    • Example of produced name: NOAAMO2010-03.txt
  • Delphi specifier for Cumulus 1.9.x: yyyymm (or YYYYMM)
    • Mono specifier for Cumulus MX: yyyyMM
    • Explanation: This is a variant on the previous; it just takes the default and then swaps year and month. Again, this naming format is popular as it results in files being in chronological sequence when listed by file name, and again it will not be recognised by any third-party web page script, nor by the MX default web page script.
    • What to type into settings box: "NOAAMO"yyyyMM".txt"
    • Example of produced name: NOAAMO201003.txt
  • Delphi specifier for Cumulus 1.9.x: yymm (or YYMM)
    • Mono specifier for Cumulus MX: yyMM
    • Explanation: This is another variant on the previous two, used by those who prefer short file names. Again, this naming format is popular as it results in files being in chronological sequence when listed by file name, and again it will not be recognised by any third-party web page script, nor by the MX default web page script.
    • What to type into settings box: "NOAAMO"yyMM".txt"
    • Example of produced name: NOAAMO1003.txt
  • Delphi specifier for Cumulus 1.9.x: MMMyyyy (or mmmyyyy or mmmYYYY or MMMYYYY)
    • Mono specifier for Cumulus MX: MMMyyyy
    • Explanation: This alternative loses the numerical representation of a month, and inserts a short month name instead. For some locales this abbreviated month will end in a full stop (e.g. Feb. in Australian English), for others it will be just 3 or 4 letters (e.g. Feb in British English). Of course, the locale might produce the month abbreviation (with or without the full stop) in another language (e.g. févr. in French). Although you may feel this provides a more readable file name, remember it will not work with the standard scripts for your web site.
    • What to type into settings box: "NOAAMO"MMMyyyy".txt"
    • Example of produced name: NOAAMOMar.2010.txt (for some English locales)
  • Delphi specifier for Cumulus 1.9.x: _MMMM_yyyy (other case variants)
    • Mono specifier for Cumulus MX: _MMMM_yyyy
    • Explanation: In theory, you could have very long file names, with the full month name. In practice, I doubt if anyone chooses this. It is normally unwise to have unnecessarily long file names.
    • What to type into settings box: "NOAAMO"_MMMM_yyyy".txt"
    • Example of produced name: NOAAMO_February_2010.txt (for some English locales)

Daily automatic report generation

If you have switched on the NOAA report functionality, then in the end-of-day process, Cumulus will output the monthly and yearly reports for the day that has just ended. This means when a new month, or new year, starts, the report is only available from the second day.

All reports are generated by processing data that Cumulus has already stored somewhere, see subsequent subsections as it varies by flavour.


MX

Steve Loft's beta 3.0.0 did not include any code for generating NOAA reports, therefore there is no connection between the way that Cumulus MX from release 3.1.0 generates the reports and the way the legacy Cumulus 1 generated reports.

Mark Crossley does share the source code for MX. This shows that MX creates the monthly and yearly reports afresh each day, reading information from dayfile.txt for past days. This means there is no difference between a report produced automatically by MX at end of day, and a report you manually ask it to produce at any time.

If you change threshold settings for degree days in MX, that only affects subsequent daily summary log entries. So although the report shows the latest thresholds, these might not apply to all information on the report. You would need to use Mark's Create Missing utility to approximately recalculate all the daily summary log (dayfile.txt) entries before you could manually rerun report generation based entirely on the new thresholds.

Mark's monthly report uses:

  1. Daily average temperature (see below as depends on release)
  2. Highest daily temperature and time (taken from fields 7 and 8 of the daily summary log)
  3. Lowest daily temperature and time (taken from fields 5 and 6 of the daily summary log)
  4. Heating degree days (taken from field 41 of the daily summary log)
  5. Cooling degree days (taken from field 42 of the daily summary log)
  6. Daily Rainfall (taken from field 15 of the daily summary log)
  7. Average Wind Speed (calculated from field 17 of daily summary log, by dividing by 24 - the number of hours in a day)
  8. Highest daily average wind speed and time (taken from fields 18 and 19 of the daily summary log)
  9. Dominant Direction (taken from field 40 of the daily summary log)
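The average wind speed calculation in item 7 is simple arithmetic. As a minimal sketch, assuming field 17 holds the distance the wind travelled during the day (the function name and the sample figure are invented for this example):

```python
HOURS_PER_DAY = 24


def average_wind_speed(daily_wind_run: float) -> float:
    """Average speed per hour, given the distance covered in one day."""
    return daily_wind_run / HOURS_PER_DAY


# A hypothetical day where the wind covered 240 distance units.
print(average_wind_speed(240.0))
```

For this made-up day the result is 10.0, in whatever speed unit matches the distance unit per hour.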

DAILY AVERAGE shown for each day of month on monthly report:

  • For releases 3.1.0 to 3.11.4, taken from field 16 of daily summary log file
  • From release 3.12.0, there is a choice between that integrated average (calculated from every temperature reading processed), and the WMO average based on limited manual observations (here calculated from adding fields 5 and 7, then dividing by 2)
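The two kinds of daily average can give noticeably different figures. The sketch below uses invented temperature readings and hypothetical function names; it only illustrates the arithmetic, not the MX source code.

```python
def integrated_average(readings):
    """Mean of every temperature reading processed during the day."""
    return sum(readings) / len(readings)


def min_max_average(lowest, highest):
    """Mean of the daily minimum and maximum (fields 5 and 7 in the text)."""
    return (lowest + highest) / 2


# Invented readings for a day that is cold for most hours, with a brief warm spell.
readings = [2.0, 1.0, 1.0, 2.0, 6.0, 12.0]

print(integrated_average(readings))                    # 4.0
print(min_max_average(min(readings), max(readings)))   # 6.5
```

Here the brief warm spell pulls the minimum/maximum average well above the integrated average, which is why the choice of method matters when comparing reports.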

MONTHLY AVERAGE shown on monthly and annual reports

  • For the summary line at the bottom of the monthly table:
    • The average shown is the average of all the figures in the column above
  • For the line shown for each month on the yearly report:
    • For releases 3.1.0 to 3.4.3, the figure shown is based on the same calculation as the summary line of the monthly table
    • For release 3.4.4 to 3.11.4, the figure shown is based on averaging field 16 of daily summary log file for all days in that month (i.e. the integrated average calculated from all temperature readings available)
    • From release 3.12.0, there is the choice between averaging field 16, or averaging both fields 5 and 7
  • For the summary line at the bottom of the yearly table:
    • For releases 3.1.0 to 3.4.3, the figure shown is based on averaging the figures in the column above, as if all months had the same number of days
    • For release 3.4.4 to 3.11.4, the figure shown is based on averaging field 16 of daily summary log file for all days in that year (i.e. the integrated average calculated from all temperature readings available)
    • From release 3.12.0, there is the choice between averaging field 16, or averaging both fields 5 and 7
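The difference between averaging the monthly figures and averaging every day of the year comes from months having different lengths. A small sketch with invented data (two months only, each with a constant temperature) shows the effect:

```python
from statistics import mean

# Invented data: 31 days at 2.0 degrees, then 28 days at 6.0 degrees.
daily_means = {"January": [2.0] * 31, "February": [6.0] * 28}

# Earlier approach: average the monthly figures, as if months were equal length.
monthly_figures = [mean(days) for days in daily_means.values()]
average_of_months = mean(monthly_figures)

# Later approach: average every daily figure in the period.
all_days = [t for days in daily_means.values() for t in days]
average_of_days = mean(all_days)

print(average_of_months)            # 4.0
print(round(average_of_days, 2))    # 3.9
```

The longer month carries more weight in the second calculation, so the two results differ slightly even though they are built from the same daily figures.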

Cumulus 1 and Cumulus 2

Although Steve Loft never shared his code for Cumulus 1, he did hint that his software retained content for the current reports.

In other words, the old report is used as the basis for any new one, the generation procedure just calculates the new line to add to a monthly report, or the line for this month in a yearly report. Then his generation procedure updates the summary.

Consequently, it seems his reports were constructed from very accurate data as each line was being made as data was read from the weather station during the day. Steve warned that for reports produced normally (in end of day process) you would make the lines incompatible by any changes you make to settings, such as thresholds during the month or year covered by the report. This confirms that the legacy Cumulus does not recalculate earlier information according to latest settings, although it shows the latest thresholds in text on the report, as if they apply to all information on the report.

You can of course change thresholds, Steve provided for that, by allowing the regeneration of past reports. Because Cumulus does not track whether you have changed thresholds, when you do choose to manually generate any reports in Cumulus 1, it ignores daily summary entries in dayfile.txt that might have been invalidated by subsequent changes to thresholds. As Cumulus does not have access to every reading it processed originally from the weather station, the manual report generation process uses the standard log file entries (these hold periodic spot values that typically miss any extremes), as the source and recalculates all the derived values shown on the report. Steve warned "Be aware that a regenerated report for a past period might not be quite as accurate as the report that Cumulus can generate as part of end of day processing", confirming that the processes for originally producing the report and for subsequently regenerating past reports were different.

Bug in some NOAA reports

Because of a bug, the average wind speed used for NOAA reports was based on midnight-to-midnight days, regardless of the rollover time in use. This affected all reports produced from when the report functionality was first added to Cumulus 1, up to and including 1.9.4 build 1085:

  • From 1.9.4 build 1086, the calculation is correctly based on the rollover time being used.


A brief history of these reports

The reports were first added to Cumulus 2 after someone asked, in enhancement request 44 (the enhancement register is no longer available), if Cumulus software could copy what a rival weather software package (Weatherlink) did, and output some climatological reports for both Monthly and Yearly periods.

The idea was to present the weather data that Cumulus software processed, with analysis against various thresholds, and comparison against climate normals; these are figures based on averaging ten to thirty year past periods, i.e. periods as defined by World Meteorological Organisation (WMO).

Steve Loft did some investigation, and found that Ken True had implemented the Weatherlink reports in his Saratoga suite, so Steve Loft took that as his starting point (see Steve's post on the Cumulus Support Forum for more details).

These Saratoga reports were based on climatological reports (annual report and monthly report), issued by the US National Oceanic and Atmospheric Administration's National Weather Service (hence why Steve Loft decided to use NOAA in the naming of the reports).

Steve Loft was developing Cumulus 2 at the time, so NOAA reports were implemented in that. Steve Loft later abandoned Cumulus 2, and his NOAA report design was then added to Cumulus 1 at version 1.9.2 (build 1004), released in July 2011. Steve Loft did not include this report functionality in his Cumulus 3 (MX) beta.

Mark Crossley added these reports to MX at release 3.1.0, using the same layout and same report naming as the legacy, but he had to reinvent the calculations.

Monthly report

For all flavours of Cumulus, the summary at the base of the monthly report is calculated as follows:

  • the arithmetic average is calculated, and shown, for Mean Temp column and Wind Speed column
  • the total is calculated, and shown, for the Rain column
  • the highest figure in the column above is found, and shown, for each column headed High, and the lowest figure for each column headed Low
    • the number in time columns represents the day number where the High/Low figure shown was reported
  • There are 4 thresholds for temperature, and the number of days below or above that threshold is counted
  • For each rain threshold, a count is made of days with rainfall above that figure
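The summary rules above can be sketched in a few lines of Python. The Day class, the summarise function, the sample days, and the threshold values are all invented for this illustration, and only one temperature threshold is shown; this is not the code either flavour of Cumulus uses.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Day:
    day: int          # day of month
    mean_temp: float
    high: float
    low: float
    rain: float
    wind: float       # average wind speed for the day


def summarise(days, high_threshold, rain_thresholds):
    hottest = max(days, key=lambda d: d.high)
    coldest = min(days, key=lambda d: d.low)
    return {
        "mean_temp": mean([d.mean_temp for d in days]),   # arithmetic average
        "mean_wind": mean([d.wind for d in days]),        # arithmetic average
        "total_rain": sum(d.rain for d in days),          # total
        "highest": (hottest.high, hottest.day),           # figure and day number
        "lowest": (coldest.low, coldest.day),
        # One hypothetical temperature threshold: days with high at or above it.
        "days_high": sum(1 for d in days if d.high >= high_threshold),
        # Days with rainfall above each rain threshold.
        "rain_days": {t: sum(1 for d in days if d.rain > t)
                      for t in rain_thresholds},
    }


days = [
    Day(1, 5.0, 9.0, 1.0, 0.0, 4.0),
    Day(2, 7.0, 12.0, 3.0, 2.5, 6.0),
    Day(3, 6.0, 10.0, -1.0, 0.5, 5.0),
]
summary = summarise(days, high_threshold=10.0, rain_thresholds=(0.2, 2.0))
print(summary)
```

For these three invented days the summary reports a mean temperature of 6.0, total rain of 3.0, the highest temperature of 12.0 on day 2, and the lowest of -1.0 on day 3.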

Annual Report

The annual report has a number of tables. Within each table a line appears for each month available for the year.

That line repeats some of the information appearing in the summary line of the respective monthly report.


Temperature table

In the settings, you can enter a month by month normal for temperature, and this report calculates, and shows, the divergence from that normal.

In the summary line:

  • for the Mean Max, Mean Min, and Dep. From Norm columns, the figure shown is the arithmetic average of figures above
  • for the mean column,
    • from release 3.12.0, there is (as described earlier) a choice about which fields in dayfile.txt, the daily summary file, are used to calculate this figure.
    • for releases 3.4.4 to 3.11.4, the figure shown is the mean of all entries for this year in field 16 of daily summary log file
    • in earlier MX releases, and in the legacy Cumulus, the figure shown is the arithmetic average of figures above
  • totals of figures above are shown for the Heat Deg Days, Cool Deg Days, and Max/Min comparisons with thresholds (see earlier for the implications if you change those thresholds)

Precipitation table

Similar to previous table, information shown represents monthly figures and divergences from normal. All should be self-explanatory. The month with most rain is shown in the summary line.

Wind Speed table

The simplest table, it shows average and highest for each month, together with dominant direction for each month. The summary line is based on the information in each column.

Note: the annual average wind speed, from release 3.4.4 is the mean of all entries for this year in field 18 of daily summary log file; but for the legacy Cumulus, and for releases 3.1.0 to 3.4.3, the figure shown is the arithmetic average of figures above

Format of reports

All reports are pure text files.

However, they contain more than just A to Z, 0 to 9, and punctuation. The additional symbol characters included in the report mean that it is important that the report and whatever is being used to view it are set to use the same set of character codes, and the way that binary representations of characters are made depends on encoding.

Encoding

When Steve Loft started writing Cumulus to run in a Microsoft environment, he selected the character set now defined as ISO-8859-1, which was used at the time by Microsoft products like Excel and Notepad. So Steve ensured all Cumulus files used that encoding. Subsequently, Steve Loft discovered that modern web pages use UTF-8 encoding.

In April 2014, Steve introduced the choice in Cumulus 1 of either ISO-8859-1 encoding (as he used originally) or UTF-8 encoding (what he migrated his web page templates to) for these reports. For backwards compatibility, the default encoding for reports selected by Steve Loft is his original ISO-8859-1 encoding, but his strongly expressed recommendation was that users should switch to UTF-8.

NOAA reports could be viewed externally using Microsoft Notepad (in the past, that defaulted to iso-8859-1 encoding) so Cumulus users were happy with reports in the default encoding.

In Cumulus 1, NOAA reports could be viewed by a selection from the View menu. This internal viewer could look up which encoding had been selected, and therefore could ensure the reports were displayed correctly.

Consistency for encoding

To add just a little more detail here, if you choose to implement a web page to display these Cumulus reports, then the HTML of the web page to display the report, the JavaScript that selects which report to show, and inserts the report into the HTML, and the report itself must all use the same encoding, otherwise you will not get characters like the degree symbol ° displaying correctly.

Steve Loft's software packages did not supply a web page for viewing these reports on web servers. However, when third party web pages for viewing the reports were made available, people started seeing strange characters in their reports. The authors of the new web pages could choose which encoding to use, but found whichever they selected, some potential users had their reports encoded with the other!
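The strange characters are easy to reproduce. This short Python sketch takes an invented report line containing the degree symbol, writes it in one encoding, and reads it back in the other:

```python
line = "Mean temp 12.3°C"   # invented sample line, not taken from a real report

as_latin1 = line.encode("iso-8859-1")   # the degree symbol becomes one byte, 0xB0
as_utf8 = line.encode("utf-8")          # the degree symbol becomes two bytes, 0xC2 0xB0

# A UTF-8 report shown by something expecting ISO-8859-1:
print(as_utf8.decode("iso-8859-1"))               # Mean temp 12.3Â°C

# An ISO-8859-1 report shown by something expecting UTF-8:
print(as_latin1.decode("utf-8", errors="replace"))  # the symbol becomes U+FFFD

# Matching encodings round-trip cleanly:
print(as_utf8.decode("utf-8"))                    # Mean temp 12.3°C
```

The stray "Â" in the first case, and the replacement character in the second, are typical of what people reported when the report and the viewer disagreed.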

How did MX complicate encodings?

MX initially complicated the issue.

It provided a web page for displaying reports (/CumulusMX/interface/noaayearreport.html), as part of its admin interface. This web page includes <meta charset="utf-8"> to set the encoding.

For releases 3.1.0 to 3.9.6, MX maintained consistency with the legacy Cumulus by having the default encoding for reports set at ISO-8859-1. On the web page for NOAA settings (/CumulusMX/interface/noaasettings.html), the hint made no mention that the default encoding was inconsistent with the viewer!

From release 3.10.0, MX has made UTF-8 encoding the default for these reports, and the default web pages provided, from that release onwards, include one for showing reports. Further consistency improvements are made from release 3.12.0.

What encoding does my web page use?

Put simply, most modern web pages start with this:

<!DOCTYPE html>
<!-- the above must be on the first line by itself and tells the browser that HTML 5 applies -->
<html lang="en"><!-- modify this to indicate your language -->
<head>
	<meta charset="UTF-8"><!-- assigns the recommended standard encoding that copes with all international characters -->
...

The last line shown there is critical, it indicates that the web page uses "utf-8" encoding.

You will find that all standard web templates included with MX start as shown above.

For Cumulus 1, from build 1094 up to the various builds defined for the final release, the above code is used.

However for earlier builds of Cumulus 1, the standard web pages start as follows:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

<head>
<meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type" />

The last line there shows how the original web templates (designed by Beth Loft) used the ISO 8859-1 character set. The original NOAA reports also used ISO-8859-1 encoding.

As Cumulus 1 does not contain any web page to display NOAA reports, legacy Cumulus users were forced to install third-party scripts to display the reports. Unfortunately there was potential for inconsistency between the encoding selected for the web page that included a place to display the report, the encoding the package expected Cumulus to use, and the encoding selected in the script that found the report and read it into the web page. Thus you can find in the support forum lots of posts from people who saw unexpected characters, or other errors in the reports, when they used these third-party packages, due to encoding issues.


TECHNICAL BIT

With that introduction, you can now choose whether to read the rest of this section which uses more technical terminology.

Let me explain that technical term: essentially, encoding refers to the character set used by any file.

A computer uses binary, and a binary bit can only be in state 0 or state 1, so a combination of 0 and 1 states needs to be defined for every character you want to represent.

What you can include in that character set depends to some extent on how many binary bits are used to be mapped to individual characters. A single computer byte has 8 bits, but some installations might use 7 bits for characters and the last bit for parity or some other controlling use.

If you use 7 bits, you have 128 combinations, enough for the standard 26 letters in both capitals and lower case, plus 10 digits (0 to 9), some punctuation, and some control characters (like new line, end of file, and so on).

If you use 8 bits, a whole byte, you have 256 combinations, and you can start coping with accented letters, with alphabets that don't have 26 letters, and even add some symbols.
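The number of combinations is 2 raised to the number of bits. The following sketch checks that arithmetic and shows that the degree symbol used in the reports does not fit into a 7 bit character set such as ASCII (the function name is invented for the example):

```python
def combinations(bits: int) -> int:
    """Number of distinct patterns that a given number of bits can hold."""
    return 2 ** bits


print(combinations(7))   # 128
print(combinations(8))   # 256

# The degree symbol has code 176, beyond the 0 to 127 range of 7 bits.
print(ord("°"))          # 176

try:
    "°".encode("ascii")
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False
print(fits_in_ascii)     # False
```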

You can have more than 8 bits, if you allow multiple bytes to specify a single character. However, this brings in extra complications for the encoding, as you need to define how many bytes are being used, and the order in which the bits within the multiple bytes are used, for example some encodings will work forward through bits within bytes, but backwards through the individual bytes.

Obviously, once you start using more than one byte, you can have 16, 32, 64, or even more bits to use and can include lots more characters and the bigger character sets start including lots of symbols and the biggest add smilies or emotion icons.
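Both points, the varying number of bytes and the byte order, can be seen with Python's built-in codecs. UTF-8 and UTF-16 are used here purely as familiar examples of multi-byte encodings:

```python
# UTF-8 uses a different number of bytes depending on the character.
for ch in ("A", "°", "\U0001F600"):      # letter, degree symbol, a smiley
    print(repr(ch), len(ch.encode("utf-8")))

# UTF-16 uses two bytes for the degree symbol, and the order of those
# two bytes depends on which variant of the encoding is chosen.
little_endian = "°".encode("utf-16-le")
big_endian = "°".encode("utf-16-be")
print(little_endian)   # b'\xb0\x00'
print(big_endian)      # b'\x00\xb0'
```

The letter needs 1 byte in UTF-8, the degree symbol 2, and the smiley 4. The two UTF-16 results contain the same bytes in opposite order, so a reader has to know which order the writer used.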

With any fixed number of bits available, there will be a limit to how many characters can be defined, and different organisations might select different characters to include. Modern encodings often focus on including emotion icons and are written in the context of communicating messages. In older mathematical contexts, a division symbol or a degree symbol might be critical. In more technical contexts, the focus might be on different types of arrows, some other drawing aid, or other technical symbols.

This is what leads to multiple encoding standards. Is a particular arrangement of bits to be used to represent the degree symbol, to represent an envelope symbol, to represent a flow chart segment, to represent a smiling face, or do you need several currency symbols? The general problem is that unless you match the encoding used initially, any retrieval cannot know what character to display for certain combinations of bits.

The encoding variability even extends to lower and upper case for letters. Some encodings put capital letters at lower binary values than lower case letters, and some put capitals at the higher binary values.
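As one concrete illustration (the choice of encodings is mine, not something this page names), ASCII places capitals below lower case letters, while the EBCDIC code page available in Python as cp037 places them above:

```python
for encoding in ("ascii", "cp037"):
    upper = "A".encode(encoding)[0]   # numeric value of the single byte
    lower = "a".encode(encoding)[0]
    print(encoding, "A =", upper, "a =", lower)
```

This prints 65 and 97 for ASCII, but 193 and 129 for cp037, so sorting raw bytes would order the same text differently under the two encodings.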