Entry: Newick format
Definition

  1. Examples

  2. Rooted, unrooted, and binary trees

  3. Grammar

      The grammar nodes
      The grammar rules

  4. See also

  5. References

  6. External links

{{Infobox file format
| name = Newick format
| released = {{start date and age|df=yes|paren=yes|1986|06|24}}
| genre = graph-theoretical trees
| free = Yes
}}

In mathematics, Newick tree format (or Newick notation or New Hampshire tree format) is a way of representing graph-theoretical trees with edge lengths using parentheses and commas. It was adopted by James Archie, William H. E. Day, Joseph Felsenstein, Wayne Maddison, Christopher Meacham, F. James Rohlf, and David Swofford, at two meetings in 1986, the second of which was at Newick's restaurant in Dover, New Hampshire, US. The adopted format is a generalization of the format developed by Meacham in 1984 for the first tree-drawing programs in Felsenstein's PHYLIP package.[1]

Examples

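A tree with leaves A, B, C and D, in which C and D descend from an internal node E, and A, B and E descend from the root F, could be represented in Newick format in several ways: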

 (,,(,));                               ''no nodes are named''
 (A,B,(C,D));                           ''leaf nodes are named''
 (A,B,(C,D)E)F;                         ''all nodes are named''
 (:0.1,:0.2,(:0.3,:0.4):0.5);           ''all but root node have a distance to parent''
 (:0.1,:0.2,(:0.3,:0.4):0.5):0.0;       ''all have a distance to parent''
 (A:0.1,B:0.2,(C:0.3,D:0.4):0.5);       ''distances and leaf names'' '''(popular)'''
 (A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;     ''distances and all names''
 ((B:0.2,(C:0.3,D:0.4)E:0.5)A:0.1)F;    ''a tree rooted on a leaf node'' '''(rare)'''

Newick format is typically used by tools such as PHYLIP and provides a minimal definition of a phylogenetic tree.
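The way the notation is assembled from a tree can be sketched in a few lines of Python. This is only an illustration: the `Node` class and the `to_newick` function are hypothetical names introduced here, and the sketch assumes that names contain no punctuation from the format and need no quoting.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node; `length` is the distance to the parent, if any."""
    name: str = ""
    length: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


def _subtree(node: Node) -> str:
    """Serialize one node: "(children)" if any, then the name, then ":length"."""
    text = ""
    if node.children:
        text = "(" + ",".join(_subtree(child) for child in node.children) + ")"
    text += node.name
    if node.length is not None:
        text += f":{node.length}"
    return text


def to_newick(root: Node) -> str:
    """Serialize the whole tree and append the terminating semicolon."""
    return _subtree(root) + ";"


# The "distances and all names" form from the examples.
tree = Node("F", children=[
    Node("A", 0.1),
    Node("B", 0.2),
    Node("E", 0.5, [Node("C", 0.3), Node("D", 0.4)]),
])
print(to_newick(tree))  # (A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;
```

Leaving every `name` empty and every `length` as `None` yields the fully unnamed form, for example `(,,(,));`.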

Rooted, unrooted, and binary trees

When an unrooted tree is represented in Newick notation, an arbitrary node is chosen as its root. Whether rooted or unrooted, typically a tree's representation is rooted on an internal node and it is rare (but legal) to root a tree on a leaf node.

A rooted binary tree that is rooted on an internal node has exactly two immediate descendant nodes for each internal node.

An unrooted binary tree that is rooted on an arbitrary internal node has exactly three immediate descendant nodes for the root node, and each other internal node has exactly two immediate descendant nodes.

A binary tree rooted from a leaf has at most one immediate descendant node for the root node, and each internal node has exactly two immediate descendant nodes.
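The three cases above differ only in how many immediate descendants the root has, which the following minimal Python sketch checks. It assumes a simplified, hypothetical representation in which a node is a tuple of its child nodes and a leaf is the empty tuple; names and lengths are left out because they do not affect the classification.

```python
def internal_nodes_below(node):
    """Yield every internal (non-leaf) node strictly below `node`."""
    for child in node:
        if child:  # a non-empty tuple has descendants, so it is internal
            yield child
            yield from internal_nodes_below(child)


def describe_binary_rooting(root):
    """Classify a tree by the number of immediate descendants of its root."""
    # Every internal node other than the root must have exactly two children.
    if not all(len(node) == 2 for node in internal_nodes_below(root)):
        return "not binary"
    kinds = {
        0: "binary tree rooted on a leaf",   # a tree consisting of one leaf
        1: "binary tree rooted on a leaf",
        2: "rooted binary tree",
        3: "unrooted binary tree",
    }
    return kinds.get(len(root), "not binary")


A = B = C = D = ()  # leaves

print(describe_binary_rooting(((A, B), (C, D))))   # rooted binary tree
print(describe_binary_rooting((A, B, (C, D))))     # unrooted binary tree
print(describe_binary_rooting(((B, (C, D)),)))     # binary tree rooted on a leaf
```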

Grammar

A grammar for parsing the Newick format:

The grammar nodes

    '''Tree''': the full input Newick format for a single tree
    '''Subtree''': an internal node (and its descendants) or a leaf node
    '''Leaf''': a node with no descendants
    '''Internal''': a node and its one or more descendants
    '''BranchSet''': a set of one or more Branches
    '''Branch''': a tree edge and its descendant subtree
    '''Name''': the name of a node
    '''Length''': the length of a tree edge

The grammar rules

Note, "|" separates alternatives.

    '''Tree''' → '''Subtree''' ";" | '''Branch''' ";"
    '''Subtree''' → '''Leaf''' | '''Internal'''
    '''Leaf''' → '''Name'''
    '''Internal''' → "(" '''BranchSet''' ")" '''Name'''
    '''BranchSet''' → '''Branch''' | '''Branch''' "," '''BranchSet'''
    '''Branch''' → '''Subtree''' '''Length'''
    '''Name''' → ''empty'' | ''string''
    '''Length''' → ''empty'' | ":" ''number''

Whitespace (spaces, tabs, carriage returns, and linefeeds) within a ''number'' is prohibited. Whitespace within a ''string'' is often prohibited. Whitespace elsewhere is ignored. Sometimes the Name string must be of a specified fixed length; otherwise the punctuation characters from the grammar (semicolon, parentheses, comma, and colon) are prohibited in it. The Tree → Branch ";" production gives the root itself a branch length, making the entire tree descend from nowhere, which can be nonsensical and is sometimes prohibited.
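The grammar rules above map fairly directly onto a recursive-descent parser, with one method per production. The Python sketch below is one possible reading, not a reference implementation: `Node`, `NewickParser` and `parse_newick` are hypothetical names, all whitespace is stripped before parsing (so names containing whitespace are assumed not to occur), quoted names and comments are not handled, and the Tree → Branch ";" production is accepted.

```python
from dataclasses import dataclass, field
from typing import List, Optional

PUNCTUATION = "();,:"  # characters that may not appear inside a name or number


@dataclass
class Node:
    name: str = ""
    length: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


class NewickParser:
    def __init__(self, text: str):
        self.text = "".join(text.split())  # assumption: drop all whitespace
        self.pos = 0

    def peek(self) -> str:
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def expect(self, char: str) -> None:
        if self.peek() != char:
            raise ValueError(f"expected {char!r} at position {self.pos}")
        self.pos += 1

    def take_token(self) -> str:
        """Read a (possibly empty) run of non-punctuation characters."""
        start = self.pos
        while self.peek() and self.peek() not in PUNCTUATION:
            self.pos += 1
        return self.text[start:self.pos]

    def parse_tree(self) -> Node:
        # Tree -> Subtree ";" | Branch ";"  (a Branch with empty Length is a Subtree)
        root = self.parse_branch()
        self.expect(";")
        if self.pos != len(self.text):
            raise ValueError("unexpected characters after ';'")
        return root

    def parse_branch(self) -> Node:
        # Branch -> Subtree Length ; Length -> empty | ":" number
        node = self.parse_subtree()
        if self.peek() == ":":
            self.pos += 1
            node.length = float(self.take_token())
        return node

    def parse_subtree(self) -> Node:
        # Subtree -> Leaf | Internal ; Internal -> "(" BranchSet ")" Name
        children = []
        if self.peek() == "(":
            self.pos += 1
            children.append(self.parse_branch())
            while self.peek() == ",":  # BranchSet -> Branch | Branch "," BranchSet
                self.pos += 1
                children.append(self.parse_branch())
            self.expect(")")
        return Node(name=self.take_token(), children=children)


def parse_newick(text: str) -> Node:
    return NewickParser(text).parse_tree()


root = parse_newick("(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;")
print(root.name, [child.name for child in root.children])  # F ['A', 'B', 'E']
```

Malformed input such as a missing closing parenthesis or a colon without a number raises `ValueError` in this sketch.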

Note that when a tree having more than one leaf is rooted from one of its leaves, a representation that is rarely seen in practice, the root leaf is characterized as an Internal node by the above grammar. Generally, a root node labeled as Internal should be construed as a leaf if and only if it has exactly one Branch in its BranchSet. One can make a grammar that formalizes this distinction by replacing the above Tree production rule with

    '''Tree''' → '''RootLeaf''' ";" | '''RootInternal''' ";" | '''Branch''' ";"
    '''RootLeaf''' → '''Name''' | "(" '''Branch''' ")" '''Name'''
    '''RootInternal''' → "(" '''Branch''' "," '''BranchSet''' ")" '''Name'''

The first RootLeaf production is for a tree with exactly one leaf. The second RootLeaf production is for rooting a tree from one of its two or more leaves.
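The rule that a root should be construed as a leaf exactly when it has a single Branch can also be checked without a full parse, by looking for a comma at the outermost level of parentheses. The Python sketch below illustrates this; `root_is_leaf` is a hypothetical helper, and it assumes well-formed input with no quoted names or comments.

```python
def root_is_leaf(newick: str) -> bool:
    """Return True if the root has at most one Branch, i.e. is construed as a leaf."""
    text = "".join(newick.split())  # assumption: whitespace carries no meaning
    if not text.endswith(";"):
        raise ValueError("a Newick tree must end with ';'")
    if not text.startswith("("):
        return True  # a bare Name: the tree consists of exactly one leaf
    depth = 0
    for char in text:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        elif char == "," and depth == 1:
            return False  # a second Branch directly under the root
    return True


print(root_is_leaf("((B:0.2,(C:0.3,D:0.4)E:0.5)A:0.1)F;"))  # True
print(root_is_leaf("(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;"))    # False
```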

See also

  • phyloXML
  • T-REX (Webserver) allows handling phylogenetic trees and networks in the Newick format.

References

1. ^ The Newick tree format.

External links

  • Gary Olsen's Interpretation of the "Newick's 8:45" Tree Format Standard  
  • Miyamoto and Goodman's Phylogram of Eutherian Mammals: an example of a large phylogram with its Newick format representation.