When working with phylogenetic trees, you'll encounter two dominant file formats: Newick and NEXUS. Understanding these formats is essential for data exchange between different phylogenetic software.
Newick Format
The Newick format (also called New Hampshire format) is a simple, compact way to represent tree topology using nested parentheses.
Basic Syntax
- Parentheses () define clades
- Commas , separate taxa or clades
- Colons : precede branch lengths
- A semicolon ; terminates the tree
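The four rules above are enough for a quick sanity check before you hand a string to real software. The sketch below is only that: `check_newick` is a hypothetical helper name, and it assumes taxon names are unquoted, so parentheses inside quoted names would be miscounted.

```python
def check_newick(text: str) -> list[str]:
    """Return a list of problems found by a shallow syntax check (empty if none).

    Assumption: names are unquoted, so every parenthesis is structural.
    """
    problems = []
    stripped = text.strip()
    # Rule: a semicolon terminates the tree
    if not stripped.endswith(";"):
        problems.append("missing terminating semicolon")
    # Rule: parentheses define clades, so they must be balanced
    depth = 0
    for ch in stripped:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                problems.append("closing parenthesis without a matching opener")
                break
    if depth > 0:
        problems.append("unclosed parenthesis")
    return problems


print(check_newick("(A,(B,C));"))  # no problems reported
print(check_newick("(A,(B,C)"))    # reports two problems
```

A check like this catches truncated files and copy-paste damage, but it does not validate branch lengths or labels.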
Examples
# Simple tree with 3 taxa
(A,B,C);
# Tree with topology: A is sister to (B,C)
(A,(B,C));
# Tree with branch lengths
(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);
# Tree with internal node labels (support values)
(A:0.1,B:0.2,(C:0.3,D:0.4)90:0.5);
# Rooted tree with branch lengths
((A:0.1,B:0.2):0.3,(C:0.3,D:0.4):0.5);
Reading Newick Trees
To read a Newick string, start from the innermost parentheses and work outward:
((A,B),(C,D));
Read as:
- A and B form a clade
- C and D form a clade
- These two clades are sisters
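The same inside-out reading can be sketched as a small recursive parser. This is an illustrative example, not how any particular package does it: the function names and the dict layout are made up for this sketch, and it assumes unquoted names with no bracketed comments.

```python
import re

# One token is either a structural character or a run of other non-space characters
TOKEN = re.compile(r"\s*([(),:;]|[^(),:;\s]+)")
PUNCT = set("(),:;")


def parse_newick(text):
    """Parse an unquoted Newick string into nested dicts (a sketch)."""
    tokens = TOKEN.findall(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def parse_node():
        nonlocal pos
        node = {"name": None, "length": None, "children": []}
        if peek() == "(":                 # a clade: parse its comma-separated members
            pos += 1
            node["children"].append(parse_node())
            while peek() == ",":
                pos += 1
                node["children"].append(parse_node())
            if peek() != ")":
                raise ValueError("expected ')'")
            pos += 1
        if peek() is not None and peek() not in PUNCT:
            node["name"] = peek()         # taxon name or internal label (e.g. support)
            pos += 1
        if peek() == ":":                 # optional branch length
            pos += 1
            node["length"] = float(peek())
            pos += 1
        return node

    root = parse_node()
    if peek() != ";":
        raise ValueError("expected ';' at end of tree")
    return root


def leaf_names(node):
    """Collect the taxon names below a node, left to right."""
    if not node["children"]:
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(leaf_names(child))
    return names


def clades(node):
    """List the leaf set of every internal node, innermost clades first."""
    found = []
    for child in node["children"]:
        found.extend(clades(child))
    if node["children"]:
        found.append(leaf_names(node))
    return found


print(clades(parse_newick("((A,B),(C,D));")))
# [['A', 'B'], ['C', 'D'], ['A', 'B', 'C', 'D']]
```

The output lists the two inner clades before the clade that joins them, which mirrors the order in which you read the string by hand.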
Common File Extensions
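Newick files commonly use: .nwk, .newick, .tre, or .tree. You may also see .phy, but that extension more often denotes a PHYLIP alignment, so check the file contents before assuming it holds a tree.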
NEXUS Format
NEXUS is a more comprehensive format that can store trees, character matrices, and analysis settings in a single file.
Basic Structure
#NEXUS
BEGIN TAXA;
DIMENSIONS NTAX=4;
TAXLABELS
Taxon_A
Taxon_B
Taxon_C
Taxon_D
;
END;
BEGIN TREES;
TREE tree1 = ((Taxon_A,Taxon_B),(Taxon_C,Taxon_D));
END;
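To see how the BEGIN ... END; pairs delimit a file, here is a minimal sketch that splits a NEXUS string into its blocks and pulls the Newick strings out of the TREES block. The function names are hypothetical, and the sketch makes simplifying assumptions: no bracketed comments, no quoted names containing semicolons, each block closed with END; and at most one block of each type.

```python
import re

BLOCK = re.compile(r"\bBEGIN\s+(\w+)\s*;(.*?)\bEND\s*;", re.IGNORECASE | re.DOTALL)
TREE = re.compile(r"\bTREE\s+(\S+)\s*=\s*([^;]*;)", re.IGNORECASE)


def nexus_blocks(text):
    """Map each block name (upper-cased) to the text between BEGIN ...; and END;."""
    if not text.lstrip().upper().startswith("#NEXUS"):
        raise ValueError("not a NEXUS file: missing #NEXUS header")
    return {name.upper(): body.strip() for name, body in BLOCK.findall(text)}


def nexus_trees(trees_block):
    """Map each tree name to its Newick string, given the body of a TREES block."""
    return {name: newick for name, newick in TREE.findall(trees_block)}


example = """#NEXUS
BEGIN TAXA;
DIMENSIONS NTAX=4;
TAXLABELS
Taxon_A
Taxon_B
Taxon_C
Taxon_D
;
END;
BEGIN TREES;
TREE tree1 = ((Taxon_A,Taxon_B),(Taxon_C,Taxon_D));
END;
"""

blocks = nexus_blocks(example)
print(list(blocks))                   # ['TAXA', 'TREES']
print(nexus_trees(blocks["TREES"]))   # {'tree1': '((Taxon_A,Taxon_B),(Taxon_C,Taxon_D));'}
```

Regular expressions are enough for tidy files like this one; for files produced by real analyses, a dedicated parser is the safer choice.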
NEXUS Blocks
NEXUS files contain different blocks for different data types:
- TAXA: List of taxon names
- CHARACTERS/DATA: Character matrix (DNA, morphology)
- TREES: One or more phylogenetic trees
- ASSUMPTIONS: Character weights, type sets
- SETS: Taxon sets, character sets
- PAUP/MRBAYES: Software-specific commands
Character Matrix Example
#NEXUS
BEGIN DATA;
DIMENSIONS NTAX=4 NCHAR=10;
FORMAT DATATYPE=DNA GAP=- MISSING=?;
MATRIX
Taxon_A ATGCATGCAT
Taxon_B ATGCATGCAT
Taxon_C ATGGATGCAT
Taxon_D ATGGATGCGT
;
END;
BEGIN TREES;
TREE best = ((Taxon_A,Taxon_B),(Taxon_C,Taxon_D));
END;
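The NTAX and NCHAR values in DIMENSIONS must agree with the matrix that follows, and a mismatch is a common reason for a file to be rejected. The sketch below reads the matrix into a dictionary and checks both numbers. The helper names are hypothetical, and it assumes a non-interleaved matrix with one unquoted taxon name and one sequence per line.

```python
import re


def read_matrix(data_block):
    """Return (sequences, ntax, nchar) from the text of a DATA block (a sketch)."""
    ntax = re.search(r"NTAX\s*=\s*(\d+)", data_block, re.IGNORECASE)
    nchar = re.search(r"NCHAR\s*=\s*(\d+)", data_block, re.IGNORECASE)
    matrix = re.search(r"\bMATRIX\b(.*?);", data_block, re.IGNORECASE | re.DOTALL)
    if not (ntax and nchar and matrix):
        raise ValueError("DIMENSIONS or MATRIX not found")
    sequences = {}
    for line in matrix.group(1).splitlines():
        if line.strip():
            name, seq = line.split(None, 1)       # first field is the taxon name
            sequences[name] = "".join(seq.split())  # drop any spaces inside the sequence
    return sequences, int(ntax.group(1)), int(nchar.group(1))


def check_dimensions(sequences, ntax, nchar):
    """Compare the declared dimensions with what the matrix actually contains."""
    problems = []
    if len(sequences) != ntax:
        problems.append(f"NTAX={ntax} but matrix has {len(sequences)} taxa")
    for name, seq in sequences.items():
        if len(seq) != nchar:
            problems.append(f"{name}: NCHAR={nchar} but sequence has {len(seq)} characters")
    return problems


example_data = """BEGIN DATA;
DIMENSIONS NTAX=4 NCHAR=10;
FORMAT DATATYPE=DNA GAP=- MISSING=?;
MATRIX
Taxon_A ATGCATGCAT
Taxon_B ATGCATGCAT
Taxon_C ATGGATGCAT
Taxon_D ATGGATGCGT
;
END;
"""

sequences, ntax, nchar = read_matrix(example_data)
print(check_dimensions(sequences, ntax, nchar))  # [] for the example above
```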
Common File Extensions
NEXUS files commonly use: .nex, .nexus, .nxs
Comparison
| Feature | Newick | NEXUS |
|---|---|---|
| Complexity | Simple | Complex |
| Trees only | Yes | No (can include matrices) |
| Multiple trees | One per line | TREES block |
| Character data | No | Yes |
| Metadata | Limited | Extensive |
| Human readable | Yes | Yes |
Special Characters
Some characters require special handling in both formats:
- Spaces: Replace them with underscores, or enclose the taxon name in single quotes
- Parentheses, commas, colons: Enclose names containing these in single quotes
- Apostrophes: Inside a quoted name, write each apostrophe as two single quotes ('')
# Taxon names with spaces
('Homo sapiens','Pan troglodytes');
# Using underscores instead
(Homo_sapiens,Pan_troglodytes);
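If you generate tree files from a script, the quoting rules above can be applied automatically. The following is a minimal sketch: `quote_taxon` is a hypothetical helper, and the set of characters that trigger quoting (the ones listed above plus semicolons and square brackets) is an assumption you may need to adjust for your software.

```python
# Characters assumed to require quoting; adjust for the parser you target
NEEDS_QUOTES = set(" ()[],:;'")


def quote_taxon(name: str) -> str:
    """Wrap a taxon name in single quotes if it contains a special character."""
    if any(ch in NEEDS_QUOTES for ch in name):
        # Apostrophes inside a quoted name are written as two single quotes
        return "'" + name.replace("'", "''") + "'"
    return name


print(quote_taxon("Homo sapiens"))    # 'Homo sapiens'
print(quote_taxon("Homo_sapiens"))    # Homo_sapiens
print(quote_taxon("O'Brien's frog"))  # 'O''Brien''s frog'
```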
Converting Between Formats
Many tools can convert between Newick and NEXUS:
- PhyloVerse: Import either format, export both
- FigTree: Open and save in multiple formats
- R (ape package): read.tree() and write.tree() for Newick, read.nexus() and write.nexus() for NEXUS
- Biopython: Phylo module handles both formats
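For a single tree, converting Newick to NEXUS amounts to wrapping the string in a TREES block. The sketch below does this with plain string handling. `newick_to_nexus` is a hypothetical helper written for this page, it writes no TAXA block, and it does not validate the tree. For real files, a library routine such as Biopython's `Phylo.convert` is likely the better choice.

```python
def newick_to_nexus(newick: str, tree_name: str = "tree1") -> str:
    """Wrap one Newick string in a minimal NEXUS TREES block (a sketch)."""
    newick = newick.strip()
    if not newick.endswith(";"):
        newick += ";"          # NEXUS tree statements also end with a semicolon
    return "\n".join([
        "#NEXUS",
        "BEGIN TREES;",
        f"TREE {tree_name} = {newick}",
        "END;",
        "",
    ])


print(newick_to_nexus("((A,B),(C,D));"))
```

Going the other way is simpler still: the Newick string is everything after the equals sign in the TREE statement.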
Work with Any Format
PhyloVerse accepts Newick, NEXUS, and other common phylogenetic formats. Upload your files and start visualizing immediately.