Reports folder

From Cumulus Wiki

This page applies to both flavours (Cumulus 1 and Cumulus MX).


All flavours of Cumulus software include, in their release distribution, a sub-folder called Reports. Please note that its name starts with a capital letter, whereas other folders (all of them for Cumulus 1 and 2, most of them for MX) have entirely lower case names.

Optional functionality

The generation of files to be stored in this folder is optional functionality:

  • Cumulus MX (MX settings): switch this functionality on by selecting NOAA Settings in the Settings menu. The various settings are explained in MX settings. (Image: NOAA settings.png, in which "month specification" is abbreviated to "MS")
  • Cumulus 1.9.x (configuration menu): switch this functionality on by selecting NOAA setup from the Configuration menu. (Image: Cumulus Configuration Menu.png)


NOAA style Report Naming

The report name has to change as the current period (month or year) changes, so Reports must be named using date formatting. For MX, this means all parts of the file name for the report must be understandable when processed by the C# date format parser. The legacy Cumulus uses Delphi to interpret the file name.

In practice this means that the capital letter "M" is the basis for identifying the month (e.g. MM is a 2 digit month number), the lower case "y" is repeated to indicate the number of digits to be used to represent the year, and any fixed text is enclosed in quotes so that it is treated as a literal.

Microsoft Operating Systems use file extensions (that by default are hidden) to indicate file types, so all report names must end with the literal ".txt" to confirm to the operating system that the report is a text file.


The prefix literal

The monthly reports have a default prefix literal of "NOAAMO". Some people have changed this into their own language.

The yearly reports have a default prefix literal of "NOAAYR". I have found Cumulus will actually accept alternatives; you might want to use "Blwyddyn" (Welsh language) or "Année" (French language) or the equivalent in your language.

Cumulus accepts single or double quotes to define a literal for these report names.

Whilst the Viewer provided in Cumulus will accept whatever you put in quotes for this literal, the MX Default Web Site will only accept the default.

The date modifier between the literals

The defaults selected by Steve Loft are MMyyyy (monthly) and yyyy (yearly), expressed here in a way that suits both Cumulus 1 and MX, so the inserted part is entirely numerical. Here is a table showing the main alternative options for the date modifier, and how they look with the fixed literal prefix and the text file type literal suffix, as required for the box in settings.

The report viewer in both the legacy Cumulus and the current MX release will accept all options shown here. However, the default website in current MX releases will only accept the default settings.

Delphi specifier (Cumulus 1.9.x) | Mono specifier (Cumulus MX) | Explanation | Setting to use that suits both flavours | Example of produced name

Yearly report
YYYY or yyyy | yyyy | Cumulus 1 accepts lower or upper case. This is the default mentioned above. | "NOAAYR"yyyy".txt" | NOAAYR2010.txt
YY or yy | yy | Cumulus 1 accepts lower or upper case. This alternative format uses a 2 digit year number. | "NOAAYR"yy".txt" | NOAAYR10.txt

Monthly report
mmyyyy (or MMYYYY) | MMyyyy | Cumulus 1 accepts lower or upper case. These are equivalent to the default mentioned above, so this is most common for users whose first encounter with Cumulus is with the MX flavour. | "NOAAMO"MMyyyy".txt" | NOAAMO032010.txt
mmyy (or MMyy or mmYY or MMYY) | MMyy | Cumulus 1 accepts lower or upper case. This alternative format uses a 2 digit year number; it was frequently selected by Cumulus 1 users as it keeps file names as short as possible. | "NOAAMO"MMyy".txt" | NOAAMO0310.txt
yyyy-mm (or YYYY-MM) | yyyy-MM | Cumulus 1 accepts lower or upper case. This naming format is popular as it results in files being in chronological sequence when listed by file name. | "NOAAMO"yyyy-MM".txt" | NOAAMO2010-03.txt
MMMyyyy | MMMyyyy | Cumulus 1 accepts lower or upper case. This informative naming format uses the 3 letter month name as defined for your locale on your device, in .NET or in Mono, so it is used by those who want to quickly spot which report they want to look at. | "NOAAMO"MMMyyyy".txt" | NOAAMOMar2010.txt (for English locales)
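
As an illustration of how such a setting turns into a file name, here is a minimal Python sketch that expands only the handful of specifiers discussed on this page (quoted literals, yyyy, yy, MMM and MM). It is not the Delphi or C# formatter that Cumulus actually uses; the function name report_name and the English month list are assumptions made for the example.

```python
import re
from datetime import date

# English 3 letter month names (assumption: the real names come from the device locale)
MONTH_ABBR = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Quoted literals first, then longer specifiers before shorter ones,
# so that "MMM" is not read as "MM" followed by a stray "M".
TOKEN = re.compile(r"'[^']*'|\"[^\"]*\"|yyyy|yy|MMM|MM")


def report_name(pattern: str, period: date) -> str:
    """Expand a small subset of C#-style date specifiers into a file name."""
    def expand(match: re.Match) -> str:
        token = match.group(0)
        if token[0] in "'\"":
            return token[1:-1]              # literal text: drop the quotes
        if token == "yyyy":
            return f"{period.year:04d}"
        if token == "yy":
            return f"{period.year % 100:02d}"
        if token == "MMM":
            return MONTH_ABBR[period.month - 1]
        return f"{period.month:02d}"        # "MM"

    # Anything that is not a recognised token (such as "-") is copied unchanged.
    return TOKEN.sub(expand, pattern)


march_2010 = date(2010, 3, 1)
print(report_name('"NOAAMO"MMyyyy".txt"', march_2010))   # NOAAMO032010.txt
print(report_name('"NOAAMO"yyyy-MM".txt"', march_2010))  # NOAAMO2010-03.txt
print(report_name('"NOAAYR"yy".txt"', march_2010))       # NOAAYR10.txt
```

The sketch is case sensitive, like MX; a lower case "mm" is deliberately left untouched rather than being treated as a month.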

If you migrate from Cumulus 1 (where case does not matter) to Cumulus MX (where case does matter), then from version 3.3.0 onwards MX checks the NOAA monthly name: if it reads "NOAAMO'mmyy'.txt" (MX believes "mm" means minutes, not month), it is changed into "NOAAMO'MMyy'.txt" (which works on both Cumulus 1 and MX).
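
If you want to check an old Cumulus 1 setting by hand before migrating, the following Python sketch shows one way to upper-case the month specifier while leaving quoted literals alone. It is only an illustration under the assumption that literals are quoted as in the table above; the function name fix_month_case is hypothetical and this is not the code MX itself runs.

```python
def fix_month_case(pattern: str) -> str:
    """Replace lower case "m" with "M" outside quoted literals."""
    out = []
    quote = None  # the quote character that opened the current literal, if any
    for ch in pattern:
        if quote:
            if ch == quote:
                quote = None      # closing quote ends the literal
        elif ch in "'\"":
            quote = ch            # opening quote starts a literal
        elif ch == "m":
            ch = "M"              # month specifier outside a literal
        out.append(ch)
    return "".join(out)


print(fix_month_case('"NOAAMO"mmyy".txt"'))  # "NOAAMO"MMyy".txt"
```

Any "m" inside a literal prefix (for example a translated word containing that letter) is preserved, because only text outside the quotes is treated as date formatting.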


A brief history of these reports

The reports were first added to Cumulus 2 after someone asked, in enhancement request 44 (the enhancement register is no longer available), if Cumulus software could copy what a rival weather software package (Weatherlink) did, and output some climatological reports for both Monthly and Yearly periods.

The idea was to present the weather data that Cumulus software processed, with analysis against various thresholds, and comparison against climate normals; these are figures based on averaging ten to thirty year past periods, i.e. periods as defined by World Meteorological Organisation (WMO).

Steve Loft did some investigation, and found that Ken True had implemented the Weatherlink reports in his Saratoga suite, so Steve Loft took that as his starting point (see Steve's post on the Cumulus Support Forum for more details).

These Saratoga reports were based on climatological reports (annual report and monthly report) issued by the US National Oceanic and Atmospheric Administration's National Weather Service (which is why Steve Loft decided to use NOAA in the naming of the reports).

Steve Loft was developing Cumulus 2 at the time, so NOAA reports were implemented in that. Steve Loft later abandoned Cumulus 2, and his NOAA report design was then added to version 1.9.2 (build 1004) released in July 2011. Steve Loft did not include this report functionality in his Cumulus 3 (MX) beta.

Mark Crossley added these reports to MX at release 3.1.0, using the same layout and same report naming as the legacy, but he had to reinvent the calculations.

Daily report update

In the end-of-day process, Cumulus outputs the monthly and yearly reports for the day that has just ended. This means that when a new month, or new year, starts, the report is only available from the second day.

Although Steve Loft never shared his code for Cumulus 1, he did hint that his software retained content for the current reports, and that the daily update built on the existing report, just adding the new line, and updating the summary. Consequently, it seems his reports were constructed from very accurate data as each line was being made as data was read from the weather station during the day. He warned "Be aware that a regenerated report for a past period might not be quite as accurate as the report that Cumulus can generate as part of end of day processing", confirming that the processes for originally producing the report and for subsequently regenerating past reports were different. For regenerating past reports, all the data for the reports was calculated from processing standard log file entries (these hold periodic spot values that typically miss any extremes). Steve also warned that for reports produced normally (in end of day process) you would make the lines incompatible by any changes you make to settings, such as thresholds. The legacy Cumulus does not recalculate earlier information according to latest settings, but it shows the latest thresholds as if they apply to all information on the report.

Owing to a bug, the average wind speed used for NOAA reports was based on midnight to midnight days, regardless of the rollover time in use, for any reports produced by versions up to and including 1.9.4 build 1085:

  • From 1.9.4 build 1086, the calculation is correctly based on the rollover time being used.

Mark Crossley does share the source code for MX, and this shows that MX creates the monthly and yearly reports afresh each day, reading information from dayfile.txt for past days. If you change threshold settings for degree days in MX, that only affects subsequent daily summary log entries, so again, although the report shows the latest thresholds, these might not apply to all information on the report. Mark's monthly report uses:

  1. Daily average temperature (depends on release)
  2. Highest daily temperature and time (taken from fields 7 and 8 of the daily summary log)
  3. Lowest daily temperature and time (taken from fields 5 and 6 of the daily summary log)
  4. Heating degree days (taken from field 41 of the daily summary log)
  5. Cooling degree days (taken from field 42 of the daily summary log)
  6. Daily Rainfall (taken from field 15 of the daily summary log)
  7. Average Wind Speed (calculated from field 17 of daily summary log, by dividing by 24 - the number of hours in a day)
  8. Highest daily average wind speed and time (taken from fields 18 and 19 of the daily summary log)
  9. Dominant Direction (taken from field 40 of the daily summary log)

For releases 3.1.0 to 3.11.4, daily average temperature is taken from field 16 of the daily summary log file. From release 3.12.0, there is a choice between that integrated average (calculated from every temperature reading processed), and the WMO average based on limited manual observations (here calculated by adding fields 5 and 7, then dividing by 2).
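
The two derived figures described above (average wind speed from field 17, and the WMO style average temperature from fields 5 and 7) are simple arithmetic. A minimal Python sketch follows; the function names and the sample values are hypothetical, and it does not read dayfile.txt itself.

```python
def average_wind_speed(wind_run: float) -> float:
    """Average speed for the day: the daily total in field 17 divided by 24 hours."""
    return wind_run / 24


def wmo_average_temp(lowest: float, highest: float) -> float:
    """WMO style daily average: (field 5 + field 7) / 2."""
    return (lowest + highest) / 2


# Hypothetical values for one day
print(average_wind_speed(120.0))    # 5.0
print(wmo_average_temp(4.0, 12.0))  # 8.0
```

The integrated average in field 16 cannot be reproduced this way, because it depends on every temperature reading processed during the day.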

For all flavours of Cumulus, the summary at the base of the monthly report is calculated as follows:

  • the arithmetic average is calculated, and shown, for Mean Temp column and Wind Speed column
  • the total is calculated, and shown, for the Rain column
  • the highest figure in the column above is found, and shown, for all columns headed High (and the lowest figure for all columns headed Low)
    • the number in time columns represents the day number where the High/Low figure shown was reported
  • There are 4 thresholds for temperature, and the number of days below or above that threshold is counted
  • For each rain threshold, a count is made of days with rainfall above that figure
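
The summary rules listed above can be sketched in Python as follows. The Day structure, the sample figures and the threshold values are all hypothetical, only one of the temperature thresholds is shown, and the use of "greater than or equal to" for the threshold counts is an assumption rather than something taken from the Cumulus source.

```python
from statistics import mean
from typing import NamedTuple


class Day(NamedTuple):
    day: int
    mean_temp: float
    high_temp: float
    low_temp: float
    rain: float
    wind: float


# Hypothetical daily lines from a monthly report
DAYS = [
    Day(1, 10.0, 15.0, 5.0, 0.0, 4.0),
    Day(2, 12.0, 18.0, 6.0, 2.5, 6.0),
    Day(3, 8.0, 11.0, 3.0, 12.5, 8.0),
]


def monthly_summary(rows, max_temp_threshold, rain_thresholds):
    hottest = max(rows, key=lambda d: d.high_temp)
    coldest = min(rows, key=lambda d: d.low_temp)
    return {
        "mean_temp": mean(d.mean_temp for d in rows),   # arithmetic average
        "mean_wind": mean(d.wind for d in rows),        # arithmetic average
        "total_rain": sum(d.rain for d in rows),        # total
        "highest": (hottest.high_temp, hottest.day),    # figure and day number
        "lowest": (coldest.low_temp, coldest.day),      # figure and day number
        # Count of days at or above one temperature threshold (comparison is an assumption)
        "days_max_over": sum(1 for d in rows if d.high_temp >= max_temp_threshold),
        # Count of days at or above each rain threshold
        "rain_days": {t: sum(1 for d in rows if d.rain >= t) for t in rain_thresholds},
    }


summary = monthly_summary(DAYS, max_temp_threshold=15.0, rain_thresholds=(0.2, 2.0, 20.0))
print(summary)
```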


Annual Report

The annual report has a number of tables. Within each table a line appears for each month available for the year.

That line repeats some of the information appearing in the summary line of the respective monthly report.


Temperature table

In the settings, you can enter a month by month normal for temperature, and this report calculates, and shows, the divergence from that normal.

In the summary line:

  • for the Mean Max, Mean Min, and Dep. From Norm columns, the figure shown is the arithmetic average of figures above
  • for the mean column,
    • from release 3.4.4, the figure shown is the mean of all entries for this year in field 16 of daily summary log file
    • in earlier MX releases, and the legacy Cumulus, the figure shown is the arithmetic average of figures above
  • totals of figures above are shown for the Heat Deg Days, Cool Deg Days, and Max/Min comparisons with thresholds
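
The two ways of calculating the annual mean give different answers because months have different lengths. A small Python sketch with made-up figures (two months only, every day in a month having the same mean) shows the effect:

```python
from statistics import mean

# Hypothetical daily mean temperatures: 31 days in January, 28 in February
january = [2.0] * 31
february = [9.0] * 28

# Earlier MX releases and legacy Cumulus: average of the monthly figures
mean_of_months = mean([mean(january), mean(february)])

# From release 3.4.4: mean of every daily entry for the year so far
mean_of_days = mean(january + february)

print(mean_of_months)          # 5.5
print(round(mean_of_days, 2))  # 5.32
```

The second figure is lower here because the colder month contributes more days to the calculation.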

Precipitation table

Similar to previous table, information shown represents monthly figures and divergences from normal. All should be self-explanatory. The month with most rain is shown in the summary line.

Wind Speed table

The simplest table, it shows average and highest for each month, together with dominant direction for each month. The summary line is based on the information in each column.

Note: from release 3.4.4, the annual average wind speed is the mean of all entries for this year in field 18 of the daily summary log file; but for the legacy Cumulus, and for releases 3.1.0 to 3.4.3, the figure shown is the arithmetic average of figures above.

Format of reports

All reports are pure text files. However, they contain more than just A to Z, 0 to 9, and punctuation. The additional symbol characters included in the report mean that it is important that the report, and whatever is being used to view it, are set to use the same set of character codes, because the way that binary representations of characters are made depends on the encoding.

Encoding

When Steve Loft started writing Cumulus to run in a Microsoft environment, he selected the character set now defined as ISO-8859-1, which was used at the time by Microsoft products like Excel and Notepad. So Steve ensured all Cumulus Files used that encoding. Subsequently, Steve Loft discovered that modern web pages use UTF-8 encoding.

In April 2014, Steve introduced the choice in Cumulus 1 of either ISO-8859-1 encoding (as he used originally) or UTF-8 encoding (what he migrated his web page templates to) for these reports. For backwards compatibility, the default encoding for reports selected by Steve Loft is his original ISO-8859-1 encoding, but his strongly expressed recommendation was that users should switch to UTF-8.

NOAA reports could be viewed externally using Microsoft Notepad (in the past, that defaulted to iso-8859-1 encoding) so Cumulus users were happy with reports in the default encoding.

In Cumulus 1, NOAA reports could be viewed by a selection from the View menu. This internal viewer could look up which encoding had been selected, and therefore could ensure the reports were displayed correctly.

Consistency for encoding

To add just a little more detail here, if you choose to implement a web page to display these Cumulus reports, then the HTML of the web page to display the report, the JavaScript that selects which report to show, and inserts the report into the HTML, and the report itself must all use the same encoding, otherwise you will not get characters like the degree symbol ° displaying correctly.
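
What a mismatch looks like can be reproduced with a few lines of Python. This is only a demonstration using a made-up report line; it does not touch any Cumulus files.

```python
line = "Mean temp 12.3°C"  # hypothetical line from a report

utf8_bytes = line.encode("utf-8")         # report saved as UTF-8
latin1_bytes = line.encode("iso-8859-1")  # report saved as ISO-8859-1

# UTF-8 report shown by a viewer expecting ISO-8859-1:
# the two bytes of the degree symbol appear as two separate characters.
print(utf8_bytes.decode("iso-8859-1"))    # Mean temp 12.3Â°C

# ISO-8859-1 report shown by a viewer expecting UTF-8:
# the single byte is not valid UTF-8, so a replacement character is shown.
print(latin1_bytes.decode("utf-8", errors="replace"))

# When both sides agree, the text survives unchanged.
print(utf8_bytes.decode("utf-8") == line)  # True
```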

Steve Loft's software packages did not supply a web page for viewing these reports on web servers. However, when third party web pages for viewing the reports were made available, people started seeing strange characters in their reports. The authors of the new web pages could choose which encoding to use, but found whichever they selected, some potential users had their reports encoded with the other!

How did MX complicate encodings?

MX initially complicated the issue.

It provided a web page for displaying reports (/CumulusMX/interface/noaayearreport.html), as part of its admin interface. This web page includes <meta charset="utf-8"> to set the encoding.

For releases 3.1.0 to 3.9.6, MX maintained consistency with the legacy Cumulus by having the default encoding for reports set at ISO-8859-1. On the web page for NOAA settings (/CumulusMX/interface/noaasettings.html), the hint made no mention that the default encoding was inconsistent with the viewer!

From release 3.10.0, MX has made UTF-8 encoding the default for these reports, and the default web pages provided, from that release onwards, include one for showing reports. Further consistency improvements are made from release 3.12.0.

What encoding does my web page use?

Put simply, most modern web pages start with this:

<!DOCTYPE html>
<!-- the above must be on the first line by itself and tells the browser that HTML 5 applies -->
<html lang="en"><!-- modify this to indicate your language -->
<head>
	<meta charset="UTF-8"><!-- assigns the recommended standard encoding that copes with all international characters -->
...

The last line shown there is critical: it indicates that the web page uses "utf-8" encoding.

You will find that all standard web templates included with MX start as shown above. For Cumulus 1, from build 1094 up to the various builds defined for final release, the above code is used. However, for earlier builds of Cumulus 1, the standard web pages start as follows:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

<head>
<meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type" />

The last line there shows how the original web templates (designed by Beth Loft) used the ISO 8859-1 character set. Consequently, the original NOAA reports used ISO-8859-1 encoding and for compatibility with this original setting, the default encoding for NOAA reports is unchanged despite the mismatch with web pages, because Cumulus 1 does not contain any web page to display NOAA reports.
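
If you are unsure which of the two styles your own page uses, you can search its source for the charset declaration. The Python sketch below handles both forms of meta tag shown above; the function name declared_charset is hypothetical, and a simple regular expression like this is only a rough check, not a full HTML parser.

```python
import re
from typing import Optional

# Matches both <meta charset="UTF-8"> and
# <meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type" />
META_CHARSET = re.compile(r"<meta[^>]*charset\s*=\s*[\"']?([\w-]+)", re.IGNORECASE)


def declared_charset(html: str) -> Optional[str]:
    """Return the charset named in the first matching meta tag, in lower case."""
    match = META_CHARSET.search(html)
    return match.group(1).lower() if match else None


print(declared_charset('<head><meta charset="UTF-8"></head>'))  # utf-8
print(declared_charset(
    '<meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type" />'
))  # iso-8859-1
```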


TECHNICAL BIT

With that introduction, you can now choose whether to read the rest of this section which uses more technical terminology.

Let me explain that technical term: essentially, encoding refers to the character set used by any file.

A computer uses binary, binary can only be in state 0 or state 1, so a combination of 0 and 1 states needs to be defined for every character you want to represent. What you can include in that character set depends to some extent on how many binary bits are used to be mapped to individual characters; and if more than one byte worth of bits is used the order in which the bits within the multiple bytes are used must be defined for each particular encoding.

With any fixed number of bits available, there will be a limit to how many characters can be defined, and different organisations might select different characters to include. This is what leads to multiple encoding standards. One might use a particular arrangement of bits to represent the degree symbol, while another encoding uses that particular arrangement of bits for a different purpose. The general problem is that unless you match the encoding used initially, any retrieval cannot know what character to display for certain combinations of bits.
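
To see one arrangement of bits meaning different things, compare ISO-8859-1 with code page 437 (the character set of the original IBM PC, chosen here purely as a second example and not an encoding that Cumulus offers). A short Python demonstration:

```python
byte = b"\xb0"  # the bit pattern 1011 0000

print(byte.decode("iso-8859-1"))  # ° (degree symbol)
print(byte.decode("cp437"))       # ░ (a shading block used for drawing boxes)

# The degree symbol does exist in code page 437, but at a different value.
print("°".encode("cp437"))        # b'\xf8'
```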

This means that when you read a file you probably find the letters A to Z where you expect them, but whether you see correct case cannot be guaranteed. Some encodings put capital letters at lower binary values than lower case letters, and some put capitals at higher binary values.

If you use 7 bits, you have 128 combinations, enough for the standard 26 letters in both capitals and lower case, plus 10 digits (0 to 9), some punctuation, and some control characters (like new line, end of file, and so on). If you use 8 bits, a whole byte, you have 256 combinations, and you can start coping with accented letters, with alphabets that don't have 26 letters, and even add some symbols. Obviously, once you start using more than one byte, you can have 16, 32, or even more bits to use, so you can include lots more characters; the bigger character sets include lots of symbols, and the biggest add smileys and emoticons.
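
The arithmetic can be checked in Python: n bits give 2 to the power n combinations. The same snippet also shows how UTF-8 spends more bytes on characters outside the basic set; the sample characters are chosen only for illustration.

```python
def combinations(bits: int) -> int:
    """Number of distinct values that can be represented with this many bits."""
    return 2 ** bits


print(combinations(7))   # 128
print(combinations(8))   # 256
print(combinations(16))  # 65536

# UTF-8 uses a variable number of bytes per character
for ch in ("A", "°", "€", "😀"):
    print(ch, len(ch.encode("utf-8")))  # 1, 2, 3 and 4 bytes respectively
```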