CARO names for PhVx
hexfiles issue 1
One thing that greatly concerns those of us interested in virii is the way AVs name them.
In the early 90's, a guideline for naming virii was formulated: the CARO virus naming convention. People thought that this would bring order to the chaos. Probably due to lack of cooperation and coordination, the AVs went their own way in naming virii. Old virii whose CARO names had already been set were still renamed by AVs to suit their own ends. This resulted in a single virus having several CARO names.
If the leading AVs do not agree with each other in naming virii, what more can be expected of Philippine AVs that are not aware of the CARO guidelines? Of the three local AVs of note, only one is moving towards compliance with the CARO guidelines, in its own way. Often, the three AVs do not use the same name for a PhVx.
HEX-FILES wants to straighten this out and hopes to see uniformity in the names used for PhVx, at least in the Philippines if not in the global AV and VX community.
I formulated a simplified guideline for naming PhVx. It is based on and conforms to the CARO guidelines. A PhVx is named in the format:
Family_Name.Group_Name.Major_Variant.Minor_Variant
1. Family_Name
The new PhVx is only given a new Family_Name if it is found not to belong to an existing Family_Name.
2. Group_Name
The new PhVx is only given a new Group_Name if it is found not to belong to an existing Group_Name.
If the existing members of Family_Name are not grouped into Group_Names and the new PhVx does not differ in structure from the existing members, Group_Name can be omitted and members of Family_Name are instead identified by Major_Variant.
However, if the new PhVx differs from the other members of Family_Name, the existing members are given the "Standard" Group_Name if no descriptive name can be arrived at, and the new PhVx is given an appropriate Group_Name.
If Family_Name is segregated into groups, every member of Family_Name should belong to a Group_Name.
3. Major_Variant
The new PhVx Major_Variant is often its infective length. The Major_Variant can be omitted if there is only one member of Group_Name, or Family_Name, as the case may be.
4. Minor_Variant
The new PhVx is only given a Minor_Variant if there exist at least two members of Group_Name (or Family_Name, as the case may be) having the same Major_Variant.
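The four-part format and its omission rules can be sketched in Python. This is only an illustration under stated assumptions: the function name `compose_name` is hypothetical, and the infective length 1024 in the usage lines is a made-up placeholder, not the length of any actual PhVx.

```python
def compose_name(family, group=None, major=None, minor=None):
    """Join the parts that are present with '.', skipping omitted ones.

    Assumption: in this simplified scheme the Family_Name is always given.
    """
    if not family:
        raise ValueError("Family_Name is required in this sketch")
    parts = [family, group, major, minor]
    # Omitted parts (None or empty string) simply drop out of the name.
    return ".".join(str(p) for p in parts if p not in (None, ""))


# Usage with a hypothetical infective length of 1024:
print(compose_name("Cara", "Kara", 1024))
print(compose_name("Jerusalem", "Bad_Illusion"))
print(compose_name("Sampo"))
```

Because omitted parts leave no trace in the joined name, the reader of a name such as Possessed.xxxx has to know from the family's history whether the second part is a group or a major variant.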
I came up with names for existing PhVx based on the above guidelines. Names currently used by AVs were considered. Where the current name used is not acceptable, a new name descriptive of the PhVx is used.
Everybody is enjoined to use the following:
Cara.Standard.xxxx Cara.Kara.xxxx | Cara is a universally accepted name. However, with the release of Cara.Kara in HEX-FILES, it is only proper to separate them into groups. |
Danao | Xed would have been fine if there were no Jerusalem.Bad_Illusion, which displays this text. To avoid confusion, the name DisCoVir uses is adopted. |
Jerusalem.Bad_Illusion | It is accepted that this is a member of Jerusalem family. However, I could not accept 1735 (not descriptive enough) or Xed (to avoid confusion) as Group_Name. It displays the text "Bad Illusion". |
June12.xxxx | This is the first name for the virus. AVs chose Mabuhay instead because the activation date can be easily changed. But text can be changed just as easily, encrypted or not. |
Microbe | Universally accepted, but drop the 's'. |
Msu | This is a more descriptive name for the virus as it covers the two existing variants which, based on virus descriptions, originated from two campuses of MSU - Iligan City and Marawi City. |
Oggo | This is what the author/s wanted it called, so let's give them credit. |
Pempe | Universally accepted. |
Possessed.xxxx | Universally accepted. Could have been grouped separately but for lack of a descriptive name. |
Quaint | I do not know how DisCoVir came up with that name. Due to lack of a descriptive name, it is accepted. |
Rebolusyon | This is more descriptive than Qark, Quark or any other name. The virus was written by someone who sympathizes with the revolutionary movement, and the name is a text displayed by the virus. Name first used by FVR, former name of ViRem. |
Sampo | Universally accepted. |
Tadpole | Universally accepted, but drop the 's'. |
Wpc_Bats.Ala-Eh.xxxx Wpc_Bats.Lipa.xxxx | This is the Family_Name most AVs use. Although Ala-Eh and Lipa are very similar, grouping them separately is necessary for clarity. |
Xtac | Universally accepted. |
Send your reactions, comments and suggestions to phvx@hotmail.com
Õ o Õ
The following are the CARO guidelines for naming virii. The complete file (naming.zip) is available at http://www.chibacity.com/chiba/
A New Virus Naming Convention
At a CARO meeting in 1991, a committee was formed with the objective of reducing the confusion in virus naming. This committee consisted of Fridrik Skulason (Virus Bulletin's technical editor), Alan Solomon (S&S International) and Vesselin Bontchev (University of Hamburg).
The following naming convention was chosen:
The full name of a virus consists of up to four parts, delimited by points ('.'). Any part may be missing, but at least one must be present. The general format is
Family_Name.Group_Name.Major_Variant.Minor_Variant[:Modifier]
Each part is an identifier, constructed with the characters [A-Za-z0-9_$%&!'`#-]. The non-alphanumeric characters are permitted, but should be avoided. The identifier is case-insensitive, but mixed-case characters should be used for readability. Usage of underscore ('_') (instead of space) is permitted (and even encouraged), if it improves readability. Each part is up to 20 characters long (in order to allow such monstrosities as "Green_Caterpillar"), but shorter names should be used whenever possible. However, if the shorter name is just an abbreviation of the long name, it's better to use the long name.
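The identifier rules above lend themselves to a mechanical check. The following Python sketch is an assumption-laden illustration, not part of the CARO text: the names `PART_RE`, `split_name` and `same_name` are hypothetical, and the sketch handles only the dotted part of a name (no ":Modifier" suffix).

```python
import re

# Character class copied from the convention; each part is 1 to 20 characters.
PART_RE = re.compile(r"[A-Za-z0-9_$%&!'`#-]{1,20}")


def split_name(full_name):
    """Split 'A.B.C.D' into its parts and validate each identifier."""
    parts = full_name.split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError("a name has one to four parts")
    for part in parts:
        if not PART_RE.fullmatch(part):
            raise ValueError(f"invalid identifier: {part!r}")
    return parts


def same_name(a, b):
    """Identifiers are case-insensitive, so compare names ignoring case."""
    return a.lower() == b.lower()


print(split_name("Jerusalem.Bad_Illusion"))
print(same_name("SAMPO", "Sampo"))
```

Since missing parts are simply left out, the position of a part in the returned list does not by itself say whether it is a group, a major variant or a minor variant; deciding that still takes knowledge of the family.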
1. Family names.
The Family_Name represents the family to which the virus belongs. Every attempt is made to group the existing viruses into families, depending on the structural similarities of the viruses, but we understand that a formal definition of a family is impossible.
When selecting a Family_Name, the following guidelines must be applied:
"Must"
1) Do not use company names, brand names, or names of living people, except where the virus is provably written by the person. Common first names are permissible, but be careful - avoid if possible. In particular, avoid names associated with the anti-virus world. If a virus claims to be written by a particular person or company do not believe it without further proof.
2) Do not use an existing Family_Name, unless the viruses belong to the same family.
3) Do not invent a new name if there is an existing, acceptable name.
4) Do not use obscene or offensive names.
5) Do not assume that just because an infected sample arrives with a particular name, that the virus has that name.
6) Avoid numeric Family_Names like V845. They should never be used as family names, as the members of the family may have different lengths. When a new virus appears and a new Family_Name must be selected for it, it is acceptable to use a temporary name like _1234, but this must be changed as soon as possible.
"Should"
1) Avoid Family_Names like Friday 13th, September 22nd. They should not be used as family names, as members of the family may have different activation dates.
2) Avoid geographic names which are based on the discovery site - the same virus might appear simultaneously in several different places.
3) If multiple acceptable names exist, select the original one, the one used by the majority of existing anti-virus programs or the more descriptive one.
"General"
1) All short (100 bytes of code or less, messages excluded) overwriting viruses are grouped under a Family_Name, called Trivial. The variants in each family are named by their infective length.
2) The relatively small viruses which do nothing but replicate and which do not contain anything particular that can be used to name them, are grouped in the following six families:
SillyC - Non-resident viruses, which infect only COM files;
SillyE - Non-resident viruses, which infect only EXE files;
SillyCE - Non-resident viruses, which infect both types of files;
SillyRC - Resident viruses, which infect only COM files;
SillyRE - Resident viruses, which infect only EXE files;
SillyRCE - Resident viruses, which infect both types of files.
The variants in each family are named after their infective length.
3) The trivial boot and master boot sector viruses which do nothing but replicate are grouped in two families:
SillyP - Trivial master boot sector infectors
SillyB - Trivial DOS boot sector infectors
The variants in each family are named after the contents of the 2nd and the 3rd bytes of the infected boot sector, in hexadecimal.
4) All overwriting viruses written in a high-level programming language are grouped in a single family, called HLLO. The particular language used in the virus doesn't matter. The names of the variants in this family conform to the same rules as the Group names (see below).
5) All companion viruses written in a high-level programming language are grouped in a single family, called HLLC. The particular language used in the virus doesn't matter. The names of the variants in this family conform to the same rules as the Group names (see below).
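The six file-infecting Silly families from item 2 of the list above follow a regular pattern: an optional "R" for resident, then "C" and/or "E" for the file types infected. A minimal Python sketch of that pattern follows; the function names `silly_family` and `silly_name` are hypothetical, and the infective length in the usage line is a made-up value.

```python
def silly_family(resident, infects_com, infects_exe):
    """Return the Silly family name for a replicate-only file infector."""
    if not (infects_com or infects_exe):
        raise ValueError("must infect COM files, EXE files, or both")
    name = "Silly"
    if resident:
        name += "R"  # resident families carry an 'R'
    if infects_com:
        name += "C"
    if infects_exe:
        name += "E"
    return name


def silly_name(resident, infects_com, infects_exe, infective_length):
    """Variants are named after their infective length."""
    family = silly_family(resident, infects_com, infects_exe)
    return f"{family}.{infective_length}"


# Usage with a hypothetical infective length of 512:
print(silly_name(True, True, False, 512))
```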
2. Group names.
The Group_Name represents a major group of similar viruses in a virus family, something like a sub-family. Examples are AntiCAD (a distinguished clone of the Jerusalem family, containing numerous variants), or 1704 (a group of several virus variants in the Cascade family).
When selecting a Group_Name, the same guidelines as for a Family_Name should be applied, except that numeric names are more permissible - but only if the respective group of viruses is well known under this name.
3. Major variant name.
The major variant name is used to group viruses in a Group_Name, which are very similar, and usually have one and the same infective length. Again, the above guidelines are applied, with one major exception. The Major_Variant is almost always a number, representing the infective length, since it helps to distinguish that particular sub-group of viruses. The infective length should be used as the Major_Variant name whenever it is known. Exceptions to this rule are:
- When the infective length is not known, because the viruses are not yet analyzed. In this case, consecutive numbers are used (1, 2, 3, etc.). This should be changed as soon as more information about the viruses becomes known.
- When an alpha-numeric name of the virus sub-group already exists and is popular, or more descriptive.
4. Minor variant name.
Minor variants are viruses with the same infective length, with similar structure and behaviour, but slightly different. Usually the minor variants are different patches of one and the same virus.
When selecting a Minor_Variant name, usually consecutive letters of the alphabet are used (A, B, C, etc...). However, this is not a very hard restriction and longer names can be used as well, especially if the virus is already known under this (longer) name, or if the name is more descriptive than just a letter.
The producers of virus detection software are strongly urged to use the virus names proposed here. The anti-virus researchers are advised to use the described guidelines when selecting names for new viruses, in order to avoid further confusion.
If a scanner is not able to distinguish between two minor variants of a virus, it should output the virus name up to the recognized major variant. For instance, if it cannot distinguish between Dark_Avenger.2000.Traveller.Copy and Dark_Avenger.2000.Traveller.Zopy, it should report both variants of the virus as Dark_Avenger.2000.Traveller.
If it is also not able to distinguish between the major variants, it should report the virus up to the recognized group name. That is, if the scanner cannot tell the difference between Dark_Avenger.2000.Traveller.* and Dark_Avenger.2000.Die_Young, it should report all the variants as Dark_Avenger.2000.
Finally, if the scanner is also unable to distinguish between the different groups, it should output only the family name of the virus (Dark_Avenger in our example).
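The reporting rule amounts to outputting the longest leading run of name parts shared by every variant the scanner cannot tell apart. A small Python sketch of that idea follows. The function name `reported_name` is hypothetical, the Dark_Avenger names are written with the 2000 group throughout, and "Dark_Avenger.1800" is a made-up name used only to illustrate the fall-back to the family.

```python
def reported_name(candidates):
    """Return the leading parts common to all indistinguishable names."""
    split = [name.split(".") for name in candidates]
    common = []
    # zip stops at the shortest name, so the prefix can never overrun one.
    for parts in zip(*split):
        # Identifiers are case-insensitive, so compare them in lower case.
        if len({p.lower() for p in parts}) != 1:
            break
        common.append(parts[0])
    return ".".join(common)


print(reported_name([
    "Dark_Avenger.2000.Traveller.Copy",
    "Dark_Avenger.2000.Traveller.Zopy",
]))
print(reported_name([
    "Dark_Avenger.2000.Traveller.Copy",
    "Dark_Avenger.2000.Die_Young",
]))
```

An empty result would mean the candidates do not even share a family, in which case the scanner has no CARO name to report under this rule.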
5. Modifiers.
It is possible that a virus belongs to a particular family by its structure, but the virus writer has concealed this fact in some way. Such concealing could be the conversion of the virus into a polymorphic one by linking one of the available polymorphic engines to it, or by compressing it with some executable-file compressor (e.g., PKLite, LZEXE, etc.). The latter method is of concern only if the virus is able to spread in compressed form. Since one and the same virus could be concealed with different methods (or even with more than one method), this could cause classification confusion.
Such viruses should be classified as if the concealing mechanism has not been used, with a modifier appended to their name. This modifier indicates the particular concealing mechanism used. If the concealing tool conforms to a naming hierarchy, its full name (e.g., TPE.1_3) should be used as a modifier. When the modifier indicates a compression tool, only the first two characters of the name of the tool should be used.
For instance, the Pogue virus is a member of the Gotcha family, but uses the MtE.0_90 polymorphic engine. Therefore, its full name should be "Gotcha.Pogue:MtE.0_90".
It is permitted to use more than one modifier in the full name of the virus, if the virus uses more than one concealing mechanism, e.g. "Civil_War.1234.A:TPE.1_3:MtE.1_00:PK".
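Because modifiers may themselves contain points (as in TPE.1_3), a name is best split on ':' before its dotted parts are examined. The Python sketch below illustrates this using the two example names above; the function names `split_modifiers` and `compression_modifier` are hypothetical.

```python
def split_modifiers(full_name):
    """Separate the base name from any ':'-delimited modifiers."""
    base, *modifiers = full_name.split(":")
    return base, modifiers


def compression_modifier(tool_name):
    """A compression tool is abbreviated to its first two characters."""
    return tool_name[:2]


base, mods = split_modifiers("Civil_War.1234.A:TPE.1_3:MtE.1_00:PK")
print(base)
print(mods)
print(compression_modifier("PKLite"))
```

The base name can then be classified exactly as if no concealing mechanism had been used, which is what the convention asks for.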