Microsoft Excel is, by far, the most popular spreadsheet software in the world. One reason for that popularity is its excellent formatting options, which make life easier for the user. However, Excel cannot always tell when a field should be left unformatted, as scientists researching genes have discovered.
The problem? Microsoft Excel interprets the names of some human genes as dates. A great example is the MARCH1 gene, whose symbol stands for Membrane Associated Ring-CH-Type Finger 1. As you probably guessed, Excel replaces MARCH1 with 1-Mar (the first of March) as soon as the user leaves the cell.
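To make the failure mode concrete, here is a minimal Python sketch that flags gene symbols a spreadsheet might mistake for dates. The month list, the pattern, and the function name `looks_like_date` are assumptions made for illustration; this is a rough heuristic and not a reproduction of Excel's actual parsing rules.

```python
import re

# Month names and abbreviations that could be read as the start of a date.
# This list is an illustrative assumption, not Excel's real rule set.
_MONTHS = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "APRIL", "MAY", "JUN", "JUNE",
    "JUL", "JULY", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)

# A month token, an optional hyphen, then a one- or two-digit number.
_DATE_LIKE = re.compile(r"(?:%s)-?\d{1,2}" % "|".join(_MONTHS), re.IGNORECASE)


def looks_like_date(symbol: str) -> bool:
    """Return True if the whole symbol resembles a month plus a day number."""
    return _DATE_LIKE.fullmatch(symbol.strip()) is not None


if __name__ == "__main__":
    for symbol in ["MARCH1", "MARCHF1", "SEPT2", "TP53"]:
        print(symbol, looks_like_date(symbol))
```

Under this heuristic, MARCH1 is flagged while MARCHF1 is not, because the extra letter breaks the month-plus-number shape.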
This created a lot of issues for researchers. According to a 2016 study, roughly a fifth of published genetics papers with accompanying Excel data contained errors caused by Excel’s automatic formatting. For those reasons, scientists agreed to change the names of specific genes to mitigate the issue. For example, the previously mentioned MARCH1 gene is now called MARCHF1.
The changes were made by the HUGO Gene Nomenclature Committee (HGNC), which also published updated, extensive guidance for human gene nomenclature. Here is a short summary of the new guidelines:
- Each gene is assigned a unique symbol, HGNC ID, and descriptive name.
- Symbols contain only uppercase Latin letters and Arabic numerals.
- Symbols should not be the same as commonly used abbreviations.
- Nomenclature should not contain references to any species or ‘G’ for gene.
- Nomenclature should not be offensive or pejorative.
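Of the guidelines above, the character rule is the easiest to check mechanically. The following Python sketch tests only that one rule (uppercase Latin letters and Arabic numerals); the function name `follows_character_rule` is hypothetical, and the full HGNC guidance may allow exceptions that this check does not model.

```python
import re

# Only ASCII uppercase letters A-Z and digits 0-9, at least one character.
_ALLOWED = re.compile(r"[A-Z0-9]+")


def follows_character_rule(symbol: str) -> bool:
    """Check a symbol against the 'uppercase letters and numerals' rule only.

    This does not check uniqueness, abbreviations, species references,
    or any other guideline from the summary.
    """
    return _ALLOWED.fullmatch(symbol) is not None


if __name__ == "__main__":
    for symbol in ["MARCHF1", "march1", "MARCH 1"]:
        print(repr(symbol), follows_character_rule(symbol))
```

Note that passing this check says nothing about spreadsheet safety: MARCH1 satisfies the character rule and was still converted to a date.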
Interestingly, Microsoft has not commented on the issue, and experts don’t expect the software company to do anything about it. Genetic data represents “quite a limited use case,” according to HGNC coordinator Elspeth Bruford, who shared her opinion with The Verge.