Submitting Butterfly Records for the
TEA Seasonal Summaries and Atlas

by Rick Cavasin. For comments, contact Alan Macnaughton (info@ontarioinsects.org)

February 2022

Background

Over the years, the number of butterfly observations submitted on citizen science platforms like eButterfly and iNaturalist has ballooned. We are now getting tens of thousands of observations every year (over 50,000 in each of 2020 and 2021). It's no longer feasible to process these observations "by hand". The only way we can ensure timely and accurate conversion of the data into the Atlas format is to perform this operation using software. To make this possible, the input to that software has to be in a consistent format.

For the citizen science platforms, this is straightforward, as we receive the data in a standardised format from each platform. Although the TEA published a spreadsheet template in the past, it allowed considerable latitude in how observers could fill it out, and some people went beyond that and customised the spreadsheet. As a result, previous Atlas compilers had to spend a huge amount of their personal time converting these diverse formats into the format required by the Atlas. Not only is this process time-consuming (and maddening), but it is also error-prone. Even with all that work, we were still left with certain inconsistencies.

Many people are submitting their photo observations to iNaturalist, and the Ontario Butterfly Atlas periodically retrieves observations from iNaturalist for inclusion. For a number of reasons, we do not use observations from iNaturalist that do not include photographs (i.e. "sight" records). If you do not want to use iNaturalist (or eButterfly), or if you want to submit your "sight" observations to the Atlas, we offer the option of submitting your observations in spreadsheet format.

The New Spreadsheet Template

Sample

Observers who wish to submit their observations in spreadsheet format must use the new template provided (see link above) and follow the guidelines outlined below. Most of it is fairly self-explanatory and the template contains some sample entries. There is a small amount of flexibility - chiefly in terms of which columns you choose to use. Because the data is going to be processed using software, there is limited flexibility on what goes into those columns. You can't put just anything in there and expect the software to "understand". For example, you can ONLY use decimal latitude/longitude.

While you can use either a Common Name or a Scientific Name, you must use the names in the TEA's published list (please pick either Common or Scientific and stick with it). I will ONLY accept subspecies names for Limenitis arthemis, and for Erynnis persius. You can include genus level observations, but for the most part, these will be discarded as the Atlas does not record genus level observations (with the exception of Celastrina sp. and larval/pupal Limenitis observations).

For those who prefer to use Scientific names, and who prefer to use the most up-to-date taxonomy, some allowance will be made. You may use a more "up-to-date" scientific name if that name is currently used on iNaturalist. I don't have time to track down and verify every alternative name that's out there. If I don't recognize the scientific name you're using, your spreadsheet will be returned to you.

You will see that there are 3 sheets in the template - they should appear as tabs along the bottom of the spreadsheet. The first one is the actual submission template. The second sheet is the rules/suggestions about what can go in the various columns. The third sheet is the list of acceptable species names (Common and Scientific).

The column order can be changed, but the column names must NOT be altered, as the software needs those names in order to figure out what's in each column.

The following are either/or options:

You can enter the observation date either in the Date column, or in the Year, Month, Day columns. Please don't use both, or alternate. Pick which column(s) you want to use, and stick with it. The sample data in the sheet shows a mix for illustrative purposes.

You can enter either the Common Name or the Scientific Name but NOT BOTH (the sample data has a mix, but you should pick one and stick with it). Again, you must use the names that are in our published list.

Note that information that repeats for multiple observations (things like the date, location, latitude, longitude, etc.) MUST be copied down the columns for all entries to which it applies. You can't leave cells blank and assume that the software is going to "understand" that the date/location from the previous line will apply. The software reads each row of the spreadsheet one by one, and treats each one as an independent observation (even if you are grouping them together as a "checklist" with a common date/location). There are easy shortcuts for copying multiple cells down the columns (some are outlined in a video linked further down this page).
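
If you keep your own notes in a CSV file with blanks under repeated values, a short script can do the copying down before you submit. The following is only a sketch, not TEA software: the function name `fill_down` and the sample rows are made up for illustration, and it assumes each row has been read as a dictionary of text values (for example with Python's `csv.DictReader`).

```python
def fill_down(rows, columns):
    """Copy the last non-blank value down each of the listed columns.

    rows: list of dicts keyed by column name; columns: names to fill.
    Returns new dicts; the input rows are left unchanged.
    """
    last_seen = {}
    filled = []
    for row in rows:
        new_row = dict(row)
        for col in columns:
            value = (new_row.get(col) or "").strip()
            if value:
                last_seen[col] = value          # remember the latest real value
            elif col in last_seen:
                new_row[col] = last_seen[col]   # blank cell: reuse it
        filled.append(new_row)
    return filled


# Illustrative rows only: the second row relies on the date/location above it.
rows = [
    {"Date": "2021-07-04", "Location": "Toronto - Edward's Gardens", "Common Name": "Monarch"},
    {"Date": "", "Location": "", "Common Name": "Cabbage White"},
]
for row in fill_down(rows, ["Date", "Location"]):
    print(row["Date"], "|", row["Location"], "|", row["Common Name"])
```

Only fill down columns that really do repeat (date, location, coordinates, observers). Columns like Comments should be left alone, since a blank there usually means "nothing to add".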

Columns

For each column the following guidelines apply (you can see these guidelines in the guidelines tab at the bottom of the template):


Date
- the only format that will be accepted is YYYY-MM-DD, all Arabic numerals (no alphabetic month names, no Roman numerals)

Year
- 4 digit year (all Arabic numerals)

Month
- 1-2 digit month (no alphabetic month names, no Roman numerals)

Day
- 1-2 digit day (all Arabic numerals)
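
To check your own dates against these rules before submitting, something like the following could be used. It is a sketch under stated assumptions, not the TEA's actual software: the function name `parse_observation_date` is hypothetical, cell values are assumed to be text, and rejecting a row that fills in both Date and Year/Month/Day is my reading of the "don't use both" request.

```python
import re
from datetime import date

DATE_PATTERN = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")  # YYYY-MM-DD


def parse_observation_date(row):
    """Return a datetime.date built from Date or from Year/Month/Day.

    Raises ValueError if the row breaks the date guidelines.
    """
    text = (row.get("Date") or "").strip()
    parts = [(row.get(name) or "").strip() for name in ("Year", "Month", "Day")]
    if text and any(parts):
        # Assumption: a single row should not use both styles.
        raise ValueError("use either Date or Year/Month/Day, not both")
    if text:
        match = DATE_PATTERN.fullmatch(text)
        if not match:
            raise ValueError(f"Date must be YYYY-MM-DD: {text!r}")
        year, month, day = (int(group) for group in match.groups())
    elif all(parts):
        year_text, month_text, day_text = parts
        if not (re.fullmatch(r"[0-9]{4}", year_text)
                and re.fullmatch(r"[0-9]{1,2}", month_text)
                and re.fullmatch(r"[0-9]{1,2}", day_text)):
            raise ValueError(f"Year/Month/Day must be numerals: {parts!r}")
        year, month, day = int(year_text), int(month_text), int(day_text)
    else:
        raise ValueError("no complete date given")
    return date(year, month, day)  # also rejects impossible dates like Feb 30


print(parse_observation_date({"Date": "2021-07-04"}))
print(parse_observation_date({"Year": "2021", "Month": "7", "Day": "4"}))
```

Entries such as "July 4, 2021" or "2021-VII-04" raise a `ValueError` here, which is the kind of entry the guidelines rule out.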

Location
- free form text place name - suggested format is to start from the left with a general location like a town/city/region, followed by something more specific (please don't include Square names). For example: name of nearest Town/City, followed by name of trail or other location details. Think names like "Toronto - Edward's Gardens" or "Algonquin PP - Old Airfield". The intent is to give an Atlas user a rough idea of where an observation was made, not to pinpoint it to within a few meters. In general, County names should not be included in location names. County names are computed by our software, and are stored in a separate field in the database.

Latitude
- ONLY decimal latitude will be accepted. Please don't truncate coordinates. Observations with coordinates that appear to have been truncated to fewer than 3 decimal places will be discarded.

Longitude
- ONLY decimal longitude will be accepted. Please don't truncate coordinates. Observations with coordinates that appear to have been truncated to fewer than 3 decimal places will be discarded.
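
One way to test coordinates against these two rules (decimal degrees only, at least 3 decimal places) is sketched below. The function name `check_coordinate` and the sample values are illustrative assumptions, and this is not the check the TEA software actually performs.

```python
import re

# Optional minus sign, 1-3 digits, a decimal point, then the decimal digits.
DECIMAL_DEGREES = re.compile(r"-?[0-9]{1,3}\.([0-9]+)")


def check_coordinate(text, limit):
    """Return the coordinate as a float, or raise ValueError.

    limit is 90 for latitude and 180 for longitude.
    """
    match = DECIMAL_DEGREES.fullmatch(text.strip())
    if not match:
        raise ValueError(f"not decimal degrees: {text!r}")
    if len(match.group(1)) < 3:
        raise ValueError(f"fewer than 3 decimal places: {text!r}")
    value = float(text)
    if abs(value) > limit:
        raise ValueError(f"out of range: {text!r}")
    return value


# Illustrative values only.
print(check_coordinate("45.4215", 90))
print(check_coordinate("-75.6972", 180))
```

Counting digits in the text is only a heuristic: a spreadsheet may drop trailing zeros on its own, so a value like 45.400 can show up as 45.4 even though nothing was deliberately truncated. Formatting the coordinate columns as text, or with a fixed number of decimal places, avoids that.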

Coordinate Accuracy (optional)
- this is just a ballpark estimate of how far the Lat/Long might be from the "true" coordinates. Think of it in terms of drawing a circle around the coordinates you provided - how big a radius would you need to be sure you are encompassing all the observations to which the coordinates apply? (this mostly comes into play if you follow the "checklist" model of reporting outlined below). Note that you can use one Accuracy number for all the observations you made at a particular date/location.

Observers
- yes, we want your full name as you would like to have it appear in the Atlas. Please be consistent about how you spell your name. For people with names like "William" and "Robert", please decide if you want to go by the longer form (William/Robert) or the shorter form (Bill/Bob). Also, if you have a middle initial, please be consistent about whether or not you put a period after the initial. If you work exclusively by yourself, you can enter your name once in the first row, and we will assume that you were the only observer for all your observations. Otherwise, please include full names separated by commas, even for couples. Please don't use forms like "Bill and Freda Smith" or "Bill & Freda Smith". Instead, use "Bill Smith, Freda Smith". There's a good technical reason for doing it that way. Don't use initials in place of first names when listing additional observers!
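
The technical reason is not spelled out above, but one plausible reading is that a comma-separated list can be split into individual names mechanically, while "Bill & Freda Smith" cannot. The following sketch shows that idea; the function name `split_observers` and the specific checks are assumptions for illustration, not the TEA's code.

```python
import re


def split_observers(cell):
    """Split an Observers cell on commas and flag forms the guidelines rule out."""
    if re.search(r"\s(and|&)\s", cell, flags=re.IGNORECASE):
        raise ValueError(f"list each observer in full, separated by commas: {cell!r}")
    names = [name.strip() for name in cell.split(",") if name.strip()]
    for name in names:
        first = name.split()[0]
        if re.fullmatch(r"[A-Za-z]\.?", first):  # a lone initial such as "F." or "F"
            raise ValueError(f"use a first name, not an initial: {name!r}")
    return names


print(split_observers("Bill Smith, Freda Smith"))
```

With the names separated, each observer can be matched to earlier records individually, which is also why consistent spelling matters.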

Survey Remarks - (optional)
- use this for general info like habitat, weather, etc. that applies to all the observations you make at a given date/location. I will put this in the "Habitat" column in the Atlas. Don't include square numbers. We calculate those.

Common Name
- if you choose to specify species using its Common Name, it must be the Common Name the TEA is currently using (see the list in the 3rd sheet)

Scientific Name
- if you choose to specify species using its Scientific Name, it must be the Scientific Name the TEA is currently using, or the Scientific Name currently used on iNaturalist (see the list in the 3rd sheet)

#Adults
- use a number. The only other acceptable options are a number with a "+" sign suffix like "50+", a number with a "~" prefix like "~50", or an "x" to simply indicate that the species was present. Words like "few" and "many" are vague and context dependent. If you saw the butterflies, you should be able to come up with a rough estimate of the number - it doesn't have to be exact. If you want to mention numbers of males/females, please do so in the Comments column. If you leave it blank, the value is assumed to be zero (for example, if you are reporting eggs/larvae/pupae).

#Eggs
- same rules as #Adults above.

#Larvae
- same rules as #Adults above.

#Pupae
- same rules as #Adults above.
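
The accepted forms for the four count columns are few enough to test mechanically. Here is a sketch of how that might look; the function name `parse_count`, the returned (number, qualifier) pair, and the choices to accept a capital "X" and to reject a combined form like "~50+" are all assumptions for illustration.

```python
import re


def parse_count(cell):
    """Interpret a #Adults / #Eggs / #Larvae / #Pupae cell.

    Returns (number, qualifier). The qualifier is "" for a plain number,
    "+" for a "50+" style entry, "~" for a "~50" style entry, and "x" for
    "present, not counted" (number is None then). Blank counts as zero.
    """
    text = cell.strip()
    if text == "":
        return 0, ""
    if text.lower() == "x":
        return None, "x"
    match = re.fullmatch(r"(~?)([0-9]+)(\+?)", text)
    if not match or (match.group(1) and match.group(3)):
        raise ValueError(f"not an accepted count: {cell!r}")
    return int(match.group(2)), match.group(1) or match.group(3)


for cell in ("12", "50+", "~50", "x", ""):
    print(repr(cell), parse_count(cell))
```

An entry like "few" or "many" raises a `ValueError`, since there is no number to extract from it.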

Record type
- only use ONE value in this column. It can be one of the following: sight, photo, catch/release, specimen (use this exact spelling). If you saw 5 adults and photographed 1, put 5 in the #Adults column, and "photo" in this column even though you didn't photograph all 5.

Comments
- this is where you can put specific information that doesn't go into other columns. Please try to be brief and to the point. If you want to store extended information for your own records, please consider adding an extra column for those, and deleting it from the copy of the spreadsheet that you submit to the TEA.

Also reported on
- use this optional column to indicate if you have reported the observation on some other platform like iNaturalist or eButterfly. You can enter the Observation/Checklist numbers in the following columns, but if that's too much bother, at least indicate that this observation has been reported elsewhere so that when I find observations that look like they are duplicates, I don't have to hum and haw over it.

iNat Observation ID - (optional)
- if you reported the same observation on iNaturalist, you can put the iNat Observation ID here. That will make matching up duplicate records much easier for me.

eB checklist number - (optional)
- if you reported this observation on eButterfly, you can put the eButterfly checklist number here. This number is more important than the eB Observation ID below: we can access a checklist without the Observation ID, but not vice versa. This will make matching up duplicate records much easier for me.

eB Observation ID - (optional)
- if you reported the same observation on eButterfly, you can put the eButterfly Observation ID here. This will make matching up duplicate records much easier for me.

Comments

Note that the order of the columns can be changed, and if there's an optional column that you don't think you will ever use, you can delete it from your spreadsheet (though hiding it is probably safer). For example, if you are always going to enter your observation dates using the Year, Month and Day columns, feel free to delete the Date column. The one thing you absolutely cannot change is the spelling of the column headings. The software used by the TEA will read in the spreadsheet header, and use the column names to figure out what's in each column. It will do this by searching for the original column names as they appear in the template. If an optional column is completely missing, our software will handle it. If you delete a non-optional column, things aren't going to work, so it might be safer to just "hide" the columns you don't want to use.
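
To see why the order can change but the spelling cannot, here is a rough sketch of header lookup by name. It is not the TEA's software: the function name `map_columns` is hypothetical, and the required/optional lists in the example are a small illustrative subset, not the template's real lists.

```python
def map_columns(header, required, optional):
    """Map column names to their positions in a submitted header row.

    Order is free, but names must match the template exactly.
    Returns (positions, unknown), where unknown lists unrecognised headings.
    """
    positions = {name: index for index, name in enumerate(header)}
    missing = [name for name in required if name not in positions]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    unknown = [name for name in header
               if name not in required and name not in optional]
    return positions, unknown


# Illustrative subset of column names; the columns are deliberately reordered.
header = ["Longitude", "Latitude", "Location", "My field notes"]
positions, unknown = map_columns(
    header,
    required=["Location", "Latitude", "Longitude"],
    optional=["Comments"],
)
print(positions["Latitude"], unknown)
```

A heading retyped as "latitude" or "Lat" would simply not be found, which is the failure described above when a non-optional column is renamed or deleted.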

As a general rule, we want to discourage third party reporting. More often than not, the observation gets reported by the original observer (or multiple third parties!), with a different set of Lat/Long coordinates, resulting in duplicate observations. Even if this doesn't happen, it is very difficult to verify anything about these observations, or to ask any questions if something doesn't look quite right. As far as we're concerned, third party reports are just hearsay, and the Atlas already contains enough dubious and unverifiable observations. As a result, we're going to ignore any observations that appear to have been reported on behalf of someone else.

Some of you who are spreadsheet savvy may decide you want to fancy up my simple spreadsheet. That's fine, as long as you don't change the heading names. If you add formulae or anything like that, please strip them out before you submit your spreadsheet, or simply export your spreadsheet in CSV format before sending it in.

If these restrictions seem draconian, there are good reasons for all of them which we would be happy to explain.

The following is a suggestion for how to speed up your data entry (if you have trouble understanding this, feel free to contact me for clarification): The order of the columns is set up to facilitate chronological data entry. The idea is that you enter all the observations for an outing (or survey) at the same time. The very best time to do this is soon after your outing, rather than at the end of the season. The idea is similar to the "checklist" model used in eButterfly, where you report a series of observations for one location/date together as a group. You enter your date and location information (the info that is common to all the observations you want to group into a checklist) into one row, and then copy that information down for as many rows as you have observations to report. Then you fill in the observation-specific information (the pink columns) on a row by row basis. The coordinate accuracy is just an estimate of how far off the Lat/Long are from where the observations were made (i.e. the maximum error - use one value for all the observations at that location). Take a look at the sample data in the template to get an idea.

The way I fill this spreadsheet in is to group together observations that were recorded within a reasonable distance of each other - as one would for a checklist on eButterfly. If I observe 10 species along a 5 km hike, I enter the Lat/Long coordinates of the approximate midpoint of the hike, and copy them for all 10 species I observed, and I set the Coordinate Accuracy to something like 2500, which would be the maximum distance any individual observation might be from the midpoint of my hike. Think of the Accuracy as the radius of a circle around the Lat/Long coordinates that encompasses all your observations. This "checklist" mode of entering observations cuts down the amount of time required to enter your observations into the spreadsheet since you only have to enter the values in the yellow columns once for all the observations you made for a particular date/location (and then you copy those values for the rest of the observations in the checklist). For tips on how to quickly copy the content from one row of a spreadsheet onto a number of rows, have a look at this video: https://www.youtube.com/watch?v=u1rp9nzLQw8 There are probably lots of other videos on YouTube that explain this in other ways, along with other tips on spreadsheet data entry.
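
If you happen to have coordinates for individual sightings (from a GPS track or photo metadata, say), the "radius of a circle" idea can be worked out rather than guessed. This is a sketch only: the function names are made up for illustration, it uses the standard haversine great-circle formula with a mean Earth radius of 6,371 km, and it assumes the accuracy figure is in metres, as the 2500 in the hike example suggests.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two decimal-degree points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def checklist_accuracy(centre, points):
    """Largest distance (m) from the checklist coordinates to any sighting."""
    return max(haversine_m(centre[0], centre[1], lat, lon) for lat, lon in points)


# Illustrative coordinates: a checklist centre and three sightings near it.
centre = (45.000, -75.000)
sightings = [(45.000, -75.000), (45.010, -75.000), (44.995, -75.010)]
print(round(checklist_accuracy(centre, sightings)), "metres")
```

Rounding the result up to a convenient figure is fine, since the accuracy is only meant as a ballpark estimate.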

If you prefer to enter more exact coordinates for each individual observation, you can do that. Just be aware that the software that will process your observations will check the proximity of observations for the same species, and multiple observations of the same species reported within a few hundred metres of each other may be represented by a single observation in the Atlas. Also, keep in mind that observations with an accuracy figure greater than 10 km will be excluded from the Atlas altogether. We recommend grouping together observations that were made a short distance apart, but please keep it within a radius of a few kilometres maximum.

Another suggestion for those who report observations made over extended distances - you might want to take Atlas square boundaries into account, and split your checklists so that your observations get reported for the correct Atlas square. There are a number of ways you can do this, and detailed instructions are beyond the scope of this set of suggestions.

If you are unsure of whether or not you are filling in the template properly, feel free to submit a sample after you have filled in a few entries, and we will let you know if there are any problems with what you're doing.

For the 2022 year, we'd like to get as many submissions as possible BEFORE Dec. 1 so that we can put out the 2022 seasonal summary in a more timely fashion. But you can submit your observations (including observations from past years) at any time for inclusion in the Atlas.