DNA database

A DNA database or DNA databank is a database of DNA profiles. A DNA database can be used in the analysis of genetic diseases, genetic fingerprinting for criminology, or genetic genealogy. DNA databases may be public or private, but the largest ones are national DNA databases.

When a search of a national DNA database links a crime scene to an offender whose DNA sample is already in the database, that link is often referred to as a cold hit. A cold hit is valuable in pointing the police agency to a specific suspect, but it carries less evidential weight than a DNA match made from outside the DNA database.[1]


Forensic DNA database

A forensic DNA database is a centralised database for storing DNA profiles of individuals that enables DNA samples collected from a crime scene to be searched and compared against stored profiles. Its most important function is to produce matches between a suspected individual and crime scene bio-markers, which provides evidence to support criminal investigations and leads that help identify potential suspects. The majority of national DNA databases are used for forensic purposes.[2]

Genetic genealogy database

A genetic genealogy database is a DNA database of genealogical DNA test results. GenBank is a public sequence database that stores genome sequences, including sequences submitted by many genetic genealogists. To date, GenBank has accumulated a large number of DNA sequences from more than 140,000 registered organizations, and it is updated every day to ensure a uniform and comprehensive collection of sequence information. These sequences are mainly obtained from individual laboratories or large-scale sequencing projects. The files stored in GenBank are divided into different groups, such as BCT (bacterial), VRL (viruses) and PRI (primates). People can access GenBank through NCBI’s retrieval system and then use the “BLAST” function to identify a certain sequence within GenBank or to find similarities between two sequences.[3]

Medical DNA database

A medical DNA database is a DNA database of medically relevant genetic variations. It collects individuals’ DNA, which can be related to their medical records and lifestyle details. By recording DNA profiles, scientists may discover interactions between the genetic environment and the occurrence of certain diseases (such as cardiovascular disease or cancer), and thus find new drugs or effective treatments for controlling these diseases. Such a database is often built in collaboration with the National Health Service.[4]

National DNA databases

A national DNA database is a DNA database maintained by the government for storing DNA profiles of its population. Each DNA profile is based on PCR and uses STR (short tandem repeat) analysis. These databases are generally used for forensic purposes, which include searching and matching the DNA profiles of potential criminal suspects.[5]

In 2009, Interpol reported that there were 54 police national DNA databases in the world and that 26 more countries planned to start one.[6] In Europe, Interpol reported 31 national DNA databases and six more planned.[6] The European Network of Forensic Science Institutes (ENFSI) DNA working group made 33 recommendations in 2014 for DNA database management and guidelines for auditing DNA databases.[7] Other countries have adopted privately developed DNA databases; Qatar, for example, has adopted Bode dbSEARCH.[8]

United Kingdom

The first national DNA database, the National DNA Database (NDNAD), was established by the United Kingdom in April 1995. By 2006 it contained 2.7 million DNA profiles (about 5.2% of the UK population), as well as other information from individuals and crime scenes,[9] and by 2015 it held 5.7 million profiles.[10][11] The information is stored in the form of a digital code, which is based on the nomenclature of each STR.[12] In the UK, police have wide-ranging powers to take DNA samples and to retain them if the subject is convicted of a recordable offence.[13][14] Because of the large number of DNA profiles stored in the NDNAD, "cold hits" may occur during DNA matching: unexpected matches between an individual's DNA profile and an unsolved crime-scene DNA profile. A cold hit can introduce a new suspect into the investigation and so help to solve old cases.[15]

United States

The United States national DNA database is called the Combined DNA Index System (CODIS). It is maintained at three levels: national, state and local. Each level implements its own DNA index system. The national DNA index system (NDIS) allows DNA profiles to be exchanged and compared between participating laboratories nationally. Each state DNA index system (SDIS) allows DNA profiles to be exchanged and compared between laboratories within a state, and the local DNA index system (LDIS) allows DNA profiles to be collected at local sites and uploaded to SDIS and NDIS.

CODIS software integrates and connects all the DNA index systems at the three levels. CODIS is installed at each participating laboratory site and uses a standalone network known as the Criminal Justice Information Services Wide Area Network (CJIS WAN)[16][17] to connect to other laboratories.

As of 2011, over 9 million records were held within CODIS.[18] As of March 2011, 361,176 forensic profiles and 9,404,747 offender profiles had been accumulated,[19] making it the largest DNA database in the world. As of the same date, CODIS had produced over 138,700 matches to requests, assisting in more than 133,400 investigations.[20]

Growing public approval of DNA databases has led to the creation and expansion of many states' own DNA databases. California currently maintains the third largest DNA database in the world. Political measures such as California Proposition 69 (2004), which increased the scope of the DNA database, have already been followed by a significant increase in the number of investigations aided.

In order to decrease the number of irrelevant matches at NDIS, the Convicted Offender Index requires all 13 CODIS STRs to be present for a profile upload. Forensic profiles only require 10 of the STRs to be present for an upload.
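The completeness rule above can be illustrated with a short Python sketch that counts how many of the 13 original CODIS core STR loci are typed in a profile and compares that count with the threshold for each index. The function name, the index labels and the profile layout (a dictionary mapping a locus name to a pair of alleles) are assumptions made for this example; they do not reflect the actual CODIS software.

```python
# The 13 original CODIS core STR loci.
CODIS_CORE_LOCI = (
    "CSF1PO", "D3S1358", "D5S818", "D7S820", "D8S1179", "D13S317",
    "D16S539", "D18S51", "D21S11", "FGA", "TH01", "TPOX", "vWA",
)

# Minimum number of typed core loci per index, as stated in the text.
# The index labels are hypothetical names chosen for this sketch.
MIN_TYPED_LOCI = {"convicted_offender": 13, "forensic": 10}


def eligible_for_upload(profile, index):
    """Return True if the profile types enough core loci for the given index.

    `profile` maps locus name -> tuple of alleles; a missing or empty
    entry counts as an untyped locus.
    """
    typed = sum(1 for locus in CODIS_CORE_LOCI if profile.get(locus))
    return typed >= MIN_TYPED_LOCI[index]


# Illustrative profiles with made-up allele values.
full_profile = {locus: (10, 11) for locus in CODIS_CORE_LOCI}
partial_profile = {locus: (10, 11) for locus in CODIS_CORE_LOCI[:10]}

print(eligible_for_upload(full_profile, "convicted_offender"))     # True
print(eligible_for_upload(partial_profile, "forensic"))            # True
print(eligible_for_upload(partial_profile, "convicted_offender"))  # False
```

In this sketch a degraded crime-scene sample that yields only 10 loci can still be searched as a forensic profile, while an offender sample must be complete.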


Australia

The Australian national DNA database is called the National Criminal Investigation DNA Database (NCIDD). By the start of 2013, it contained 718,462 DNA profiles.[21][22] The database uses 9 STR loci and a sex-determining gene for analysis. NCIDD combines all forensic data, including DNA profiles, advanced biometrics and cold cases.


Canada

The Canadian national DNA database is called the National DNA Data Bank (NDDB). It was established in 1998 but first used in 2000.[23]

NDDB consists of two indexes: the Convicted Offender Index (COI) and the National Crime Scene Index (CSI-nat). There is also the Local Crime Scene Index (CSI-loc), which is maintained by local laboratories rather than by the NDDB because local DNA profiles do not meet NDDB collection criteria. The National Crime Scene Index (CSI-nat) is a collection drawn from three laboratories operated by the Royal Canadian Mounted Police (RCMP), the Laboratoire de sciences judiciaires et de médecine légale (LSJML) and the Centre of Forensic Sciences (CFS).


Israel

The Israeli national DNA database is called the Israel Police DNA Index System (IPDIS).[24] It was established in 2007 and has a collection of more than 135,000 DNA profiles. The collection includes DNA profiles from suspected and accused persons and convicted offenders. The Israeli database also includes an “elimination bank” of profiles from laboratory staff and other police personnel who may have contact with the forensic evidence in the course of their work.

To handle the high-throughput processing and analysis of DNA samples from FTA cards, the Israeli Police DNA database has established a semi-automated LIMS program, which enables a small number of police staff to process a large number of samples in a relatively short period of time and is also responsible for the future tracking of samples.


Kuwait

The Kuwaiti government passed a law in July 2015 requiring all citizens and permanent residents (4.2 million people) to have their DNA taken for a national database.[25] The law was prompted by security concerns after the ISIS suicide bombing of the Imam Sadiq mosque.[26] The government planned to finish collecting the DNA by September 2016, which outside observers thought was optimistic.[27]


Brazil

In 1998, the Forensic DNA Research Institute of the Federal District Civil Police created DNA databases of sexual assault evidence.[28] In 2012, Brazil approved a national law establishing DNA databases at state and national levels for the DNA typing of individuals convicted of violent crimes.[28] Following the 2013 decree of the Presidency of the Republic of Brazil, which regulates the 2012 law, Brazil began using CODIS in addition to the DNA databases of sexual assault evidence to solve sexual assault crimes.[28]

Interpol DNA database

The Interpol DNA database is also used in criminal investigations. Interpol maintains an automated DNA database called DNA Gateway that contains DNA profiles submitted by member countries, collected from crime scenes, missing persons and unidentified bodies.[29] The DNA Gateway was established in 2002, and at the end of 2013 it held more than 140,000 DNA profiles from 69 member countries. Unlike other DNA databases, DNA Gateway is used only for information sharing and comparison; it does not link a DNA profile to any individual, and the physical or psychological conditions of an individual are not included in the database.[30]


Compression

DNA databases occupy more storage than non-DNA databases because of the enormous size of each DNA sequence.[31][32] DNA databases are also growing exponentially from year to year. This poses a major challenge for storage, data transfer, retrieval and search. To address these challenges, DNA databases are compressed to save storage space and bandwidth during data transfers, and are decompressed during search and retrieval. Various compression algorithms are used for this purpose. The efficiency of a compression algorithm depends on how well and how fast it compresses and decompresses. How well it compresses is generally measured by the compression ratio: the greater the compression ratio, the more efficient the algorithm. The speed of compression and decompression is also considered in the evaluation.
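As a minimal illustration of a compression ratio, the Python sketch below packs each base into 2 bits (four bases per byte) and compares the packed size with a one-byte-per-base text encoding. This fixed 2-bit packing is only a simple baseline chosen for the example and is not one of the algorithms named in this article; the function names are hypothetical, and the ratio is assumed here to mean uncompressed size divided by compressed size.

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def pack(seq):
    """Pack a string of A/C/G/T into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | BASE_TO_BITS[base]
        # Left-align a final chunk shorter than four bases.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data, length):
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BITS_TO_BASE[(byte >> shift) & 3])
    return "".join(bases[:length])


def compression_ratio(original_size, compressed_size):
    """Uncompressed size divided by compressed size (assumed convention)."""
    return original_size / compressed_size


seq = "ACGTACGTACGTACGT"
packed = pack(seq)
assert unpack(packed, len(seq)) == seq  # the packing is lossless
print(compression_ratio(len(seq.encode("ascii")), len(packed)))  # 4.0
```

The sequence length has to be stored alongside the packed bytes so that padding in the last byte can be discarded. Dedicated DNA compressors aim for ratios above this baseline by exploiting repeated content.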

DNA sequences contain repetitions of A, C, T and G, including palindromic repeats. Compressing a sequence involves finding and encoding these repetitions, and decoding them when the sequence is decompressed.

Encoding approaches used to encode and decode include:

1) Huffman Encoding

2) Adaptive Huffman Encoding

3) Arithmetic coding

4) Arithmetic adaptive coding

5) Context tree weighted method
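The first approach in this list, static Huffman encoding, can be sketched in Python as follows. This is a minimal teaching example under simple assumptions (symbols are single characters and the code table is kept alongside the encoded bits); it is not the implementation used by any particular DNA compression tool, and the function names are hypothetical.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Build a prefix-free code table {symbol: bit string} from symbol counts."""
    if not text:
        raise ValueError("cannot build a code for empty input")
    freq = Counter(text)
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreak, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    reverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return "".join(out)


seq = "AAAAAAAACCCCGGT"  # skewed base frequencies, made up for illustration
codes = huffman_codes(seq)
bits = encode(seq, codes)
assert decode(bits, codes) == seq
print(len(bits), "bits versus", 2 * len(seq), "bits at a fixed 2 bits per base")
```

Frequent symbols receive shorter codes, so the gain over a fixed 2 bits per base depends on how skewed the base frequencies are. For a sequence in which all four bases are equally common, Huffman coding gives no improvement.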

A few compression algorithms that use one of the above encoding approaches to compress and decompress DNA databases are listed below:


2) RLZ

3) GenCompress

4) BioCompress

5) DNACompress


DNA databases and medicine

Many countries collect newborn blood samples to screen for diseases, mainly those with a genetic basis. Most of these samples are destroyed soon after testing. In some countries, the dried blood (and the DNA it contains) is retained for later testing.

In Denmark, the Danish Newborn Screening Biobank at Statens Serum Institut keeps a blood sample from people born after 1981. The purpose is to test for phenylketonuria and other diseases.[33] However, it is also used for DNA profiling to identify deceased persons and suspected criminals.[34] Parents can request that the blood sample of their newborn be destroyed after the result of the test is known.

Privacy issues

Critics of DNA databases warn that the various uses of the technology can pose a threat to individual civil liberties.[35][36] Personal information included in genetic material, such as markers that identify various genetic diseases and behavioral traits, could be used for discriminatory profiling, and its collection may constitute an invasion of privacy.[37] DNA can also be used to establish paternity and whether or not a child is adopted. The privacy and security issues of DNA databases have attracted wide attention. Some people fear that their personal DNA information could easily be leaked; others feel that having their DNA profile recorded in a database marks them as a "criminal", and that being falsely accused of a crime can leave them with a "criminal" record for the rest of their lives.

UK laws in 2001 and 2003 allowed DNA profiles to be taken immediately after a person was arrested and kept in a database even if the suspect was later acquitted.[38] In response to public unease at these provisions,[38] the UK later changed this by passing the Protection of Freedoms Act 2012, which required that suspects who were not charged or were found not guilty have their DNA data deleted from the database.[39]

European countries that have established a DNA database use a number of measures to protect the privacy of individuals, in particular criteria for removing DNA profiles from the database. Among the 22 European countries that have been analyzed, most record the DNA profiles of suspects or of those who have committed serious crimes. In some countries, such as the Netherlands, an individual's DNA profile can only be taken if it can help to solve the criminal case, which means that a convicted person's DNA profile will not be stored in the DNA database.[40] Some countries (such as Belgium and France) may remove a criminal’s profile after 30–40 years, because by then the profile is no longer needed for “criminal investigation”. Most countries delete a suspect’s profile after he or she is acquitted. All of these countries have comprehensive legislation intended to largely avoid the privacy issues that may arise from the use of a DNA database.[41]

Privacy concerns surrounding DNA databases arise not only when DNA samples are collected and analyzed, but also when this important personal information is stored and protected. Because DNA profiles can be stored indefinitely in a DNA database, there are concerns that the DNA samples could be used for new and as yet unidentified purposes.[42] As the number of users with access to a DNA database grows, people worry that their information may be leaked or shared inappropriately; for example, their DNA profile may be shared with others, such as law enforcement agencies or other countries, without their consent.[43]

The application of DNA databases has been expanded into two controversial areas: arrestees and familial searching. An arrestee is a person who has been arrested for a crime but has not yet been convicted of that offense. Currently, 21 states in the United States have passed legislation that allows law enforcement to take DNA from an arrestee and enter it into the state's CODIS DNA database to see whether that person has a criminal record or can be linked to any unsolved crimes. In familial searching, the DNA database is used to look for partial matches of the kind that would be expected between close family members. This technique can be used to link crimes to the family members of suspects and thereby help identify a suspect when the perpetrator has no DNA sample in the database.[44][45]
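The difference between a full match and the partial match sought in familial searching can be shown with a toy Python comparison of STR profiles. The sketch assumes a profile is a dictionary mapping a locus name to a pair of alleles, and it treats two profiles as a partial match when they share at least one allele at every compared locus, as a parent and child normally would. The function name, the three-locus profiles and the allele values are invented for illustration, and the allele-sharing rule is a simplification chosen for this example rather than the method of any real database.

```python
def compare_profiles(profile_a, profile_b):
    """Classify two STR profiles as a full match, partial match or no match."""
    common = sorted(profile_a.keys() & profile_b.keys())
    if not common:
        return "no comparable loci"
    # Loci where both alleles agree (order of the pair does not matter).
    identical = sum(
        1 for locus in common
        if sorted(profile_a[locus]) == sorted(profile_b[locus])
    )
    # Loci where at least one allele is shared.
    sharing = sum(
        1 for locus in common
        if set(profile_a[locus]) & set(profile_b[locus])
    )
    if identical == len(common):
        return "full match"
    if sharing == len(common):
        return "partial match"
    return "no match"


# Made-up three-locus profiles; real profiles use many more loci.
crime_scene = {"D3S1358": (15, 17), "TH01": (6, 9.3), "FGA": (21, 24)}
relative = {"D3S1358": (15, 16), "TH01": (9.3, 9.3), "FGA": (22, 24)}
unrelated = {"D3S1358": (14, 16), "TH01": (7, 8), "FGA": (20, 23)}

print(compare_profiles(crime_scene, crime_scene))  # full match
print(compare_profiles(crime_scene, relative))     # partial match
print(compare_profiles(crime_scene, unrelated))    # no match
```

With only a few loci, unrelated people can share alleles by chance, which is one reason partial matches serve as investigative leads rather than identifications.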

DNA collection and human rights

In a judgement in December 2008, the European Court of Human Rights ruled that two British men should not have had their DNA and fingerprints retained by police, saying that retention "could not be regarded as necessary in a democratic society".[46]

The DNA fingerprinting pioneer Professor Sir Alec Jeffreys condemned UK government plans to keep the genetic details of hundreds of thousands of innocent people in England and Wales for up to 12 years. Jeffreys said he was "disappointed" with the proposals, which came after a European court ruled that the current policy breaches people's right to privacy. Jeffreys said "It seems to be as about as minimal a response to the European court of human rights judgment as one could conceive. There is a presumption not of innocence but of future guilt here … which I find very disturbing indeed".[47]


References

  1. Rose & Goos "DNA - A Practical Guide" (Carswell Publications, Toronto).
  2. Santos, F., Machado, H., & Silva, S. (2013). Forensic DNA databases in European countries: is size linked to performance?. Life Sciences Society and Policy, 9(1), 12.
  3. Benson, D. A., Cavanaugh, M., Clark, K., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J., & Sayers, E. W. (2012). GenBank. Nucleic acids research, gks1195.
  4. Hagmann, M. (2000). UK plans major medical DNA database. Science, 287(5456), 1184-1184.
  5. Butler, J. M. (2011). Advanced Topics in Forensic DNA Typing: Methodology. Academic Press.
  6. "Global DNA Profiling Survey; Results and Analysis" (PDF). Interpol DNA Unit. 2009. Appendix 1. Retrieved 12 October 2015.
  7. ENFSI DNA Working Group. (2010). DNA database management: Review and Recommendation. The Hague (The Netherlands): ENFSI.
  8. [1]
  9. Linacre, A. (2003). The UK National DNA Database. The Lancet, 361(9372), 1841-1842.
  10. "National DNA Database statistics, Q1 2015 to 2016". National DNA Database statistics. UK Government Home Office. Retrieved 11 October 2015.
  11. NPIA UK (Communications), Gav Ireland, Simon Lewis, Dan Fookes (2012-03-31). "NPIA: Statistics". Npia.police.uk. Retrieved 2012-08-04.
  12. Gill, P. (2002). Role of short tandem repeat DNA in forensic casework in the UK-past, present, and future perspectives. Biotechniques, 32(2), 366-385.
  13. Bowcott, Owen (13 May 2015). "Retention of offenders' DNA profiles not illegal, supreme court rules". The Guardian. Retrieved 11 October 2015.
  14. Restrictions on use and destruction of fingerprints and samples
  15. Wallace, H. (2006). The UK national DNA database. EMBO reports, 7(1S), S26-S30.
  16. CODIS Brochure
  17. Butler, J. M. (2011). Advanced Topics in Forensic DNA Typing: Methodology. Academic Press.
  18. CODIS - National DNA Index System
  19. CODIS - National DNA Index System
  20. Investigations Aided
  21. CrimTrac Biometric Services
  22. Mobbs, Jonathan D. "Crimtrac-technology and detection." 4th National Outlook Symposium on Crime in Australia, New Crimes or New Responses. Canberra. 2001.
  23. Milot, E., Lecomte, M. M., Germain, H., & Crispino, F. (2013). The national DNA data bank of Canada: a Quebecer perspective. Frontiers in genetics, 4.
  24. Zamir, A., Dell’Ariccia-Carmon, A., Zaken, N., & Oz, C. (2012). The Israel DNA database—The establishment of a rapid, semi-automated analysis system. Forensic Science International: Genetics, 6(2), 286-289.
  25. Visser, Nick (14 July 2015). "Kuwait To Institute Mandatory DNA Testing For All Residents". Huffington Post. Retrieved 10 October 2015.
  26. "ISIL claims responsibility for Kuwait Shia mosque blast". Al Jazeera. 27 June 2015. Retrieved 10 October 2015.
  27. Field, Dawn (3 September 2015). "Kuwait's war on ISIS and DNA". Oxford University Press Blog. Retrieved 10 October 2015.
  28. Ferreira, Samuel T.G.; Paula, Karla A.; Maia, Flávia A.; Svidizinski, Arthur E.; Amaral, Marinã R.; Diniz, Silmara A.; Siqueira, Maria E.; Moraes, Adriana V. (2015). "The use of DNA database of biological evidence from sexual assaults in criminal investigations: A successful experience in Brasília, Brazil". Forensic Science International: Genetics Supplement Series. Elsevier Ireland Ltd.: 595–597.
  29. Interpol Forensics
  30. Forensics
  31. Ateet Mehta & Bankim Patel, et al., 2010, "DNA Compression using Hash Based Data Structure", International Journal of Information Technology and Knowledge Management July-December 2010, Volume 2, No. 2, pp. 383-386
  32. Kuruppu, S. S. (2012). Compression of Large DNA Databases (Doctoral dissertation, The University of Melbourne).
  33. http://web.archive.org/web/20141219123131/http://www.ssi.dk/sw62846.asp. Archived from the original on December 19, 2014. Retrieved December 19, 2014.
  34. Berlingske Tidende, Sept 16 2007, "Blodbank som forbryderalbum"
  35. Jeffries, Stuart (27 October 2006). "Suspect nation". The Guardian.
  36. Lemieux, Scott (March 23, 2012). "Are Police Building a Massive DNA Database?". AlterNet.
  37. "DNA database 'breach of rights'". BBC News. 4 December 2008.
  38. Wallace, H. M., Jackson, A. R., Gruber, J., & Thibedeau, A. D. (2014). Forensic DNA databases: Ethical and legal standards: A global review. Egyptian Journal of Forensic Sciences.
  39. "Protection of Freedoms Act 2012: DNA and fingerprint provisions". Protection of Freedoms Act 2012: how DNA and fingerprint evidence is protected in law. UK Government Home Office. 4 April 2014. Retrieved 11 October 2015.
  40. Martin, P. D., Schmitter, H., & Schneider, P. M. (2001). A brief history of the formation of DNA databases in forensic science within Europe. Forensic science international, 119(2), 225-231.
  41. Santos, F., Machado, H., & Silva, S. (2013). Forensic DNA databases in European countries: is size linked to performance?. Life Sciences Society and Policy, 9(1), 12.
  42. Roman-Santos, C. (2010). Concerns Associated with Expanding DNA Databases. Hastings Sci. & Tech. LJ, 2, 267.
  43. DNA databank proposal raises privacy concerns
  44. DNA Forensics
  45. Compulsory DNA Collection: A Fourth Amendment Analysis Congressional Research Service
  46. "DNA database 'breach of rights'". BBC News. 2008-12-04. Retrieved 2012-08-04.
  47. James Sturcke (2009-05-07). "DNA pioneer condemns plans to retain data on innocent". London: Guardian. Retrieved 2012-08-04.
