public class PhyloXMLEventReader extends AbstractXMLEventReader<PhyloXMLReaderStreamDataProvider> implements PhyloXMLConstants
Since trees are represented by a hierarchical structure of clade tags in PhyloXML, they need to be serialized to an event sequence according to the JPhyloIO grammar while reading. To achieve this, node and edge information is buffered until a clade end tag is reached. NodeEvents and EdgeEvents representing all edges leading to children of this node are then fired, including all nested metaevents. If custom XML is encountered before a clade end, the buffered events are fired at that point to avoid buffering large amounts of custom data. Since the predefined elements do not contain large amounts of data, buffering this information does not make reading significantly less efficient. Performance problems may occur if large molecular sequences are attached to the phylogeny.
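The buffering strategy described above can be illustrated with a minimal sketch that uses only the JDK's StAX parser. It is not JPhyloIO code: the class CladeSerializationSketch, the string representation of events and the generated IDs (n0, n1, ...) are assumptions made for illustration. Information on a clade is kept on a stack until its end tag is reached; only then are the node and the edges to its children emitted.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch, not JPhyloIO code: flattens nested <clade> tags into a
// linear sequence of node and edge events (represented here as plain strings).
final class CladeSerializationSketch {
    // Information buffered for a clade whose end tag has not been reached yet.
    private static final class OpenClade {
        final String nodeID;
        String name = "";
        final List<String> childIDs = new ArrayList<>();

        OpenClade(String nodeID) {
            this.nodeID = nodeID;
        }
    }

    static List<String> serialize(String xml) {
        List<String> events = new ArrayList<>();
        Deque<OpenClade> open = new ArrayDeque<>();  // stack of unfinished clades
        int nextID = 0;
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int type = reader.next();
                if (type == XMLStreamConstants.START_ELEMENT) {
                    if ("clade".equals(reader.getLocalName())) {
                        open.push(new OpenClade("n" + nextID++));
                    } else if ("name".equals(reader.getLocalName()) && !open.isEmpty()) {
                        // Simplification: any <name> inside a clade labels the innermost open clade.
                        open.peek().name = reader.getElementText();
                    }
                } else if (type == XMLStreamConstants.END_ELEMENT
                        && "clade".equals(reader.getLocalName())) {
                    // The clade is complete: emit its node and the edges to its children.
                    OpenClade done = open.pop();
                    events.add("NODE " + done.nodeID + " " + done.name);
                    for (String childID : done.childIDs) {
                        events.add("EDGE " + done.nodeID + "->" + childID);
                    }
                    if (!open.isEmpty()) {
                        open.peek().childIDs.add(done.nodeID);  // remember for the parent
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return events;
    }
}
```

For a root clade named root with two child clades A and B, CladeSerializationSketch.serialize(...) yields the nodes of A and B first, followed by the root node and its two edges, because the root's end tag is reached last.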
Custom XML is read in all positions where no other element reader is registered. This includes custom elements nested under elements where this is illegal. Registering custom element readers only under tags where this is valid would require buffering the whole custom contents.
Predefined data elements are represented as LiteralMetadataEvent or ResourceMetadataEvent with specific internally used predicates. ResourceMetadataEvents may be used to group events representing attribute values and element contents. Property tags are represented by a LiteralMetadataEvent with the value of the ref attribute as a predicate. If other attributes are present or the value of the applies_to attribute indicates a different position than the one the element is actually found in, these attribute values and the content are grouped by a ResourceMetadataEvent. The content of a property tag is translated to a Java object using classes of the type ObjectTranslator.
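The handling of a property tag can be sketched with the JDK alone. The following is an illustration under stated assumptions, not the reader's implementation: PropertyTranslationSketch, its Literal class and the translate method are hypothetical stand-ins for LiteralMetadataEvent and ObjectTranslator, and only two datatypes are covered. The value of the ref attribute is used as the predicate, and the element content is converted to a Java object according to the datatype attribute.

```java
import java.io.StringReader;
import java.math.BigInteger;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch, not JPhyloIO code: maps a <property> element to a
// predicate/value pair, similar in spirit to a literal metadata event.
final class PropertyTranslationSketch {
    // Stand-in for a literal metadata event (assumption for illustration).
    static final class Literal {
        final String predicate;
        final Object value;

        Literal(String predicate, Object value) {
            this.predicate = predicate;
            this.value = value;
        }
    }

    // Stand-in for an object translator lookup; unknown datatypes stay strings.
    static Object translate(String datatype, String text) {
        if (datatype == null) {
            return text;
        }
        switch (datatype) {
            case "xsd:integer":
                return new BigInteger(text.trim());
            case "xsd:double":
                return Double.valueOf(text.trim());
            default:
                return text;
        }
    }

    static Literal readProperty(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "property".equals(reader.getLocalName())) {
                    // Attributes must be read before the element text is consumed.
                    String ref = reader.getAttributeValue(null, "ref");
                    String datatype = reader.getAttributeValue(null, "datatype");
                    return new Literal(ref, translate(datatype, reader.getElementText()));
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        throw new IllegalArgumentException("No property element found");
    }
}
```

The sketch ignores the grouping case: if further attributes such as unit were present, the reader described here would wrap the values in a ResourceMetadataEvent instead.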
Phylogenies in PhyloXML files can be interpreted either as phylogenetic trees or as rooted networks, depending on the value of the parameter ReadWriteParameterNames.KEY_PHYLOXML_CONSIDER_PHYLOGENY_AS_TREE.
If a phylogeny is interpreted as a network, edges defined by clade_rel tags are represented by edge events with a nested meta event with the predicate ReadWriteConstants.PREDICATE_IS_CROSSLINK; otherwise they are represented by meta events. By default, this reader considers phylogenies as networks. If trees shall be read instead, this parameter must be provided and set to true.
Element IDs found in a PhyloXML document (specified as the value of the id_source attribute) are not the same as the event IDs generated by JPhyloIO. If an element ID is encountered, it is represented by a LiteralMetadataEvent with the predicate PhyloXMLConstants.PREDICATE_ATTR_ID_SOURCE. If a reference to an element ID is encountered (specified as the value of an id_ref attribute), it is likewise represented by a LiteralMetadataEvent with a corresponding predicate (e.g. PhyloXMLConstants.PREDICATE_SEQUENCE_ATTR_ID_REF).
The mapping of id_source values to event IDs can be obtained from the parameter map under the key ReadWriteParameterNames.KEY_PHYLOXML_EVENT_ID_TRANSLATION_MAP.
In the case of the clade_rel tag, the JPhyloIO event IDs are given in additional LiteralMetadataEvents with the predicates ReadWriteConstants.PREDICATE_EDGE_SOURCE_NODE and ReadWriteConstants.PREDICATE_EDGE_TARGET_NODE.
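The relation between document IDs and generated event IDs can be sketched as follows, again using only the JDK. This is an illustration, not the reader's code: IdTranslationSketch, the generated IDs (n0, n1, ...) and the treatment of id_ref_0 as source and id_ref_1 as target are assumptions. Every clade receives a generated event ID, clades carrying an id_source attribute are recorded in a translation map, and the references of a clade_relation element are resolved through that map.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch, not JPhyloIO code: translates id_source values to
// generated event IDs and resolves clade relation references through them.
final class IdTranslationSketch {
    // id_source value -> generated event ID (stand-in for the translation map).
    final Map<String, String> eventIDByIDSource = new LinkedHashMap<>();
    // Resolved cross links as "sourceEventID->targetEventID".
    final List<String> crossLinks = new ArrayList<>();

    static IdTranslationSketch read(String xml) {
        IdTranslationSketch result = new IdTranslationSketch();
        int nextID = 0;
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String tag = reader.getLocalName();
                if ("clade".equals(tag)) {
                    // Every clade gets an event ID, whether or not it has an id_source.
                    String eventID = "n" + nextID++;
                    String idSource = reader.getAttributeValue(null, "id_source");
                    if (idSource != null) {
                        result.eventIDByIDSource.put(idSource, eventID);
                    }
                } else if ("clade_relation".equals(tag)) {
                    // Assumption: id_ref_0 names the source node, id_ref_1 the target node.
                    String source = result.eventIDByIDSource.get(
                            reader.getAttributeValue(null, "id_ref_0"));
                    String target = result.eventIDByIDSource.get(
                            reader.getAttributeValue(null, "id_ref_1"));
                    result.crossLinks.add(source + "->" + target);
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return result;
    }
}
```

The sketch relies on relation elements following the clade hierarchy within a phylogeny, so the map is complete before any reference is resolved. An unknown reference would simply resolve to null here.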
See Also: PhyloXMLConstants, Metadata demo application
INTERNAL_USE_NAMESPACE, TAG_PARENT_OF_ROOT
APPLIES_TO_ANNOTATION, APPLIES_TO_CLADE, APPLIES_TO_NODE, APPLIES_TO_OTHER, APPLIES_TO_PARENT_BRANCH, APPLIES_TO_PHYLOGENY, ATTR_ABSENT_COUNT, ATTR_ALT_UNIT, ATTR_APPLIES_TO, ATTR_BRANCH_LENGTH, ATTR_BRANCH_LENGTH_UNIT, ATTR_COLLAPSE, ATTR_COMMENT, ATTR_CONFIDENCE, ATTR_DATATYPE, ATTR_DESC, ATTR_DISTANCE, ATTR_DOI, ATTR_EVIDENCE, ATTR_FROM, ATTR_GAINED_COUNT, ATTR_GEO_DATUM, ATTR_ID, ATTR_ID_PROVIDER, ATTR_ID_REF, ATTR_ID_REF_0, ATTR_ID_REF_1, ATTR_ID_SOURCE, ATTR_IS_ALIGNED, ATTR_LENGTH, ATTR_LOST_COUNT, ATTR_PRESENT_COUNT, ATTR_REF, ATTR_REROOTABLE, ATTR_ROOTED, ATTR_SOURCE, ATTR_STANDARD_DEVIATION, ATTR_TO, ATTR_TYPE, ATTR_UNIT, DATA_TYPE_BRANCH_COLOR, DATA_TYPE_EVENTTYPE, DATA_TYPE_RANK, DATA_TYPE_SEQUENCE_SYMBOL, DATA_TYPE_TAXONOMY_CODE, JPHYLOIO_PHYLOXML_NAMESPACE, PHYLOXML_DATA_TYPE_NAMESPACE, PHYLOXML_DEFAULT_PRE, PHYLOXML_FORMAT_NAME, PHYLOXML_NAMESPACE, PHYLOXML_PREDICATE_NAMESPACE, PHYLOXML_SCHEMA_LOCATION_URI, PREDICATE_ANNOTATION, PREDICATE_ANNOTATION_ATTR_EVIDENCE, PREDICATE_ANNOTATION_ATTR_REF, PREDICATE_ANNOTATION_ATTR_SOURCE, PREDICATE_ANNOTATION_ATTR_TYPE, PREDICATE_ANNOTATION_CONFIDENCE, PREDICATE_ANNOTATION_CONFIDENCE_ATTR_TYPE, PREDICATE_ANNOTATION_CONFIDENCE_VALUE, PREDICATE_ANNOTATION_DESC, PREDICATE_ANNOTATION_PROPERTY, PREDICATE_ANNOTATION_PROPERTY_ATTR_APPLIES_TO, PREDICATE_ANNOTATION_PROPERTY_ATTR_DATATYPE, PREDICATE_ANNOTATION_PROPERTY_ATTR_ID_REF, PREDICATE_ANNOTATION_PROPERTY_ATTR_UNIT, PREDICATE_ANNOTATION_URI, PREDICATE_ANNOTATION_URI_ATTR_DESC, PREDICATE_ANNOTATION_URI_ATTR_TYPE, PREDICATE_ANNOTATION_URI_VALUE, PREDICATE_ATTR_COLLAPSE, PREDICATE_ATTR_ID_SOURCE, PREDICATE_BINARY_CHARACTERS, PREDICATE_BINARY_CHARACTERS_ABSENT, PREDICATE_BINARY_CHARACTERS_ABSENT_BC, PREDICATE_BINARY_CHARACTERS_ATTR_ABSENT_COUNT, PREDICATE_BINARY_CHARACTERS_ATTR_GAINED_COUNT, PREDICATE_BINARY_CHARACTERS_ATTR_LOST_COUNT, PREDICATE_BINARY_CHARACTERS_ATTR_PRESENT_COUNT, PREDICATE_BINARY_CHARACTERS_ATTR_TYPE, PREDICATE_BINARY_CHARACTERS_GAINED, 
PREDICATE_BINARY_CHARACTERS_GAINED_BC, PREDICATE_BINARY_CHARACTERS_LOST, PREDICATE_BINARY_CHARACTERS_LOST_BC, PREDICATE_BINARY_CHARACTERS_PRESENT, PREDICATE_BINARY_CHARACTERS_PRESENT_BC, PREDICATE_CLADE_REL, PREDICATE_CLADE_REL_ATTR_DISTANCE, PREDICATE_CLADE_REL_ATTR_IDREF0, PREDICATE_CLADE_REL_ATTR_IDREF1, PREDICATE_CLADE_REL_ATTR_TYPE, PREDICATE_COLOR, PREDICATE_COLOR_ALPHA, PREDICATE_COLOR_BLUE, PREDICATE_COLOR_GREEN, PREDICATE_COLOR_RED, PREDICATE_CONFIDENCE, PREDICATE_CONFIDENCE_ATTR_STDDEV, PREDICATE_CONFIDENCE_ATTR_TYPE, PREDICATE_CONFIDENCE_VALUE, PREDICATE_DATE, PREDICATE_DATE_ATTR_UNIT, PREDICATE_DATE_DESC, PREDICATE_DATE_MAXIMUM, PREDICATE_DATE_MINIMUM, PREDICATE_DATE_VALUE, PREDICATE_DISTRIBUTION, PREDICATE_DISTRIBUTION_DESC, PREDICATE_DISTRIBUTION_POINT, PREDICATE_DISTRIBUTION_POINT_ALT, PREDICATE_DISTRIBUTION_POINT_ATTR_ALT_UNIT, PREDICATE_DISTRIBUTION_POINT_ATTR_GEODETIC_DATUM, PREDICATE_DISTRIBUTION_POINT_LAT, PREDICATE_DISTRIBUTION_POINT_LONG, PREDICATE_DISTRIBUTION_POLYGON, PREDICATE_DISTRIBUTION_POLYGON_POINT, PREDICATE_DISTRIBUTION_POLYGON_POINT_ALT, PREDICATE_DISTRIBUTION_POLYGON_POINT_ATTR_ALT_UNIT, PREDICATE_DISTRIBUTION_POLYGON_POINT_ATTR_GEODETIC_DATUM, PREDICATE_DISTRIBUTION_POLYGON_POINT_LAT, PREDICATE_DISTRIBUTION_POLYGON_POINT_LONG, PREDICATE_DOMAIN_ARCHITECTURE, PREDICATE_DOMAIN_ARCHITECTURE_ATTR_LENGTH, PREDICATE_DOMAIN_ARCHITECTURE_DOMAIN, PREDICATE_DOMAIN_ARCHITECTURE_DOMAIN_ATTR_CONFIDENCE, PREDICATE_DOMAIN_ARCHITECTURE_DOMAIN_ATTR_FROM, PREDICATE_DOMAIN_ARCHITECTURE_DOMAIN_ATTR_ID, PREDICATE_DOMAIN_ARCHITECTURE_DOMAIN_ATTR_TO, PREDICATE_DOMAIN_ARCHITECTURE_DOMAIN_VALUE, PREDICATE_EVENTS, PREDICATE_EVENTS_CONFIDENCE, PREDICATE_EVENTS_CONFIDENCE_ATTR_TYPE, PREDICATE_EVENTS_CONFIDENCE_VALUE, PREDICATE_EVENTS_DUPLICATIONS, PREDICATE_EVENTS_LOSSES, PREDICATE_EVENTS_SPECIATIONS, PREDICATE_EVENTS_TYPE, PREDICATE_NODE_ID, PREDICATE_NODE_ID_ATTR_PROVIDER, PREDICATE_NODE_ID_VALUE, PREDICATE_PHYLOGENY_ATTR_BRANCH_LENGTH_UNIT, 
PREDICATE_PHYLOGENY_ATTR_REROOTABLE, PREDICATE_PHYLOGENY_ATTR_TYPE, PREDICATE_PHYLOGENY_DATE, PREDICATE_PHYLOGENY_DESCRIPTION, PREDICATE_PHYLOGENY_ID, PREDICATE_PHYLOGENY_ID_ATTR_PROVIDER, PREDICATE_PHYLOGENY_ID_VALUE, PREDICATE_PROPERTY, PREDICATE_PROPERTY_ATTR_APPLIES_TO, PREDICATE_PROPERTY_ATTR_ID_REF, PREDICATE_PROPERTY_ATTR_UNIT, PREDICATE_REFERENCE, PREDICATE_REFERENCE_ATTR_DOI, PREDICATE_REFERENCE_DESC, PREDICATE_REFERENCE_VALUE, PREDICATE_SEQ_REL, PREDICATE_SEQ_REL_ATTR_DISTANCE, PREDICATE_SEQ_REL_ATTR_IDREF0, PREDICATE_SEQ_REL_ATTR_IDREF1, PREDICATE_SEQ_REL_ATTR_TYPE, PREDICATE_SEQ_REL_CONFIDENCE, PREDICATE_SEQ_REL_CONFIDENCE_ATTR_TYPE, PREDICATE_SEQ_REL_CONFIDENCE_VALUE, PREDICATE_SEQUENCE, PREDICATE_SEQUENCE_ACCESSION, PREDICATE_SEQUENCE_ACCESSION_ATTR_COMMENT, PREDICATE_SEQUENCE_ACCESSION_ATTR_SOURCE, PREDICATE_SEQUENCE_ACCESSION_VALUE, PREDICATE_SEQUENCE_ATTR_ID_REF, PREDICATE_SEQUENCE_ATTR_TYPE, PREDICATE_SEQUENCE_CROSS_REFERENCES, PREDICATE_SEQUENCE_CROSS_REFERENCES_ACCESSION, PREDICATE_SEQUENCE_CROSS_REFERENCES_ACCESSION_ATTR_COMMENT, PREDICATE_SEQUENCE_CROSS_REFERENCES_ACCESSION_ATTR_SOURCE, PREDICATE_SEQUENCE_CROSS_REFERENCES_ACCESSION_VALUE, PREDICATE_SEQUENCE_GENE_NAME, PREDICATE_SEQUENCE_LOCATION, PREDICATE_SEQUENCE_MOL_SEQ, PREDICATE_SEQUENCE_MOL_SEQ_ATTR_IS_ALIGNED, PREDICATE_SEQUENCE_MOL_SEQ_VALUE, PREDICATE_SEQUENCE_NAME, PREDICATE_SEQUENCE_SYMBOL, PREDICATE_SEQUENCE_URI, PREDICATE_SEQUENCE_URI_ATTR_DESC, PREDICATE_SEQUENCE_URI_ATTR_TYPE, PREDICATE_SEQUENCE_URI_VALUE, PREDICATE_TAXONOMY, PREDICATE_TAXONOMY_AUTHORITY, PREDICATE_TAXONOMY_CODE, PREDICATE_TAXONOMY_COMMON_NAME, PREDICATE_TAXONOMY_ID, PREDICATE_TAXONOMY_ID_ATTR_PROVIDER, PREDICATE_TAXONOMY_ID_VALUE, PREDICATE_TAXONOMY_RANK, PREDICATE_TAXONOMY_SCIENTIFIC_NAME, PREDICATE_TAXONOMY_SYNONYM, PREDICATE_TAXONOMY_URI, PREDICATE_TAXONOMY_URI_ATTR_DESC, PREDICATE_TAXONOMY_URI_ATTR_TYPE, PREDICATE_TAXONOMY_URI_VALUE, PREDICATE_WIDTH, TAG_ABSENT, TAG_ACCESSION, TAG_ALPHA, TAG_ALT, 
TAG_ANNOTATION, TAG_AUTHORITY, TAG_BC, TAG_BINARY_CHARACTERS, TAG_BLUE, TAG_BRANCH_COLOR, TAG_BRANCH_LENGTH, TAG_BRANCH_WIDTH, TAG_CLADE, TAG_CLADE_RELATION, TAG_CODE, TAG_COMMON_NAME, TAG_CONFIDENCE, TAG_CROSS_REFERENCES, TAG_DATE, TAG_DESC, TAG_DESCRIPTION, TAG_DISTRIBUTION, TAG_DOMAIN, TAG_DOMAIN_ARCHITECTURE, TAG_DUPLICATIONS, TAG_EVENTS, TAG_GAINED, TAG_GENE_NAME, TAG_GREEN, TAG_ID, TAG_LAT, TAG_LOCATION, TAG_LONG, TAG_LOSSES, TAG_LOST, TAG_MAXIMUM, TAG_MINIMUM, TAG_MOL_SEQ, TAG_NAME, TAG_NODE_ID, TAG_PHYLOGENY, TAG_POINT, TAG_POLYGON, TAG_PRESENT, TAG_PROPERTY, TAG_RANK, TAG_RED, TAG_REFERENCE, TAG_ROOT, TAG_SCI_NAME, TAG_SEQUENCE, TAG_SEQUENCE_RELATION, TAG_SPECIATIONS, TAG_SYMBOL, TAG_SYNONYM, TAG_TAXONOMY, TAG_TYPE, TAG_URI, TAG_VALUE, TYPE_NETWORK_EDGE
ATTRIBUTE_STRING_KEY, ATTRIBUTES_NAMESPACE_FOLDER, DATA_TYPE_NAMESPACE_FOLDER, DATA_TYPE_SIMPLE_VALUE_LIST, DEFAULT_CHAR_SET_ID_PREFIX, DEFAULT_CHARACTER_DEFINITION_ID_PREFIX, DEFAULT_EDGE_ID_PREFIX, DEFAULT_GENERAL_ID_PREFIX, DEFAULT_MATRIX_ID_PREFIX, DEFAULT_MAX_COMMENT_LENGTH, DEFAULT_MAX_TOKENS_TO_READ, DEFAULT_META_ID_PREFIX, DEFAULT_NETWORK_ID_PREFIX, DEFAULT_NODE_EDGE_SET_ID_PREFIX, DEFAULT_NODE_ID_PREFIX, DEFAULT_OTU_ID_PREFIX, DEFAULT_OTU_LIST_ID_PREFIX, DEFAULT_OTU_SET_ID_PREFIX, DEFAULT_SEQUENCE_ID_PREFIX, DEFAULT_SEQUENCE_SET_ID_PREFIX, DEFAULT_TOKEN_DEFINITION_ID_PREFIX, DEFAULT_TOKEN_SET_ID_PREFIX, DEFAULT_TREE_ID_PREFIX, DEFAULT_TREE_NETWORK_GROUP_ID_PREFIX, DEFAULT_TREE_NETWORK_SET_ID_PREFIX, JPHYLOIO_ATTRIBUTES_NAMESPACE, JPHYLOIO_ATTRIBUTES_PREFIX, JPHYLOIO_DATA_TYPE_NAMESPACE, JPHYLOIO_DATA_TYPE_PREFIX, JPHYLOIO_FORMATS_NAMESPACE_PREFIX, JPHYLOIO_GENERAL_NAMESPACE, JPHYLOIO_NAMESPACE_PREFIX, JPHYLOIO_PREDICATE_NAMESPACE, JPHYLOIO_PREDICATE_PREFIX, PREDICATE_CHARACTER_COUNT, PREDICATE_EDGE_LENGTH, PREDICATE_EDGE_SOURCE_NODE, PREDICATE_EDGE_TARGET_NODE, PREDICATE_HAS_CUSTOM_XML, PREDICATE_HAS_LITERAL_METADATA, PREDICATE_HAS_RESOURCE_METADATA, PREDICATE_IS_CROSSLINK, PREDICATE_NAMESPACE_FOLDER, PREDICATE_PART_SEPERATOR, PREDICATE_SEQUENCE_COUNT, RESERVED_ID_PREFIX
Constructor and Description |
---|
PhyloXMLEventReader(java.io.File file, ReadWriteParameterMap parameters) |
PhyloXMLEventReader(java.io.InputStream stream, ReadWriteParameterMap parameters) |
PhyloXMLEventReader(java.io.Reader reader, ReadWriteParameterMap parameters) |
PhyloXMLEventReader(javax.xml.stream.XMLEventReader xmlReader, ReadWriteParameterMap parameters) |
Modifier and Type | Method and Description |
---|---|
protected PhyloXMLReaderStreamDataProvider | createStreamDataProvider(): This method is called in the constructor of AbstractEventReader to initialize the stream data provider that will be returned by AbstractEventReader.getStreamDataProvider(). |
protected void | fillMap() |
java.lang.String | getFormatID(): Returns a string ID uniquely identifying the target format of this instance. |
static javax.xml.namespace.QName | readDatatypeAttributeValue(java.lang.String datatype, javax.xml.stream.events.StartElement element) |
close, createMetaXMLEventReader, createMetaXMLStreamReader, getElementReader, getElementReaderMap, getEncounteredTags, getNamespaceContext, getXMLReader, isAllowDefaultNamespace, parseQName, putElementReader, readNextEvent
addEventListener, fireEvent, getCurrentEventCollection, getIDManager, getLastNonCommentEvent, getParameters, getParentInformation, getPreviousEvent, getSequenceTokensEventManager, getStreamDataProvider, getUpcomingEvents, hasNextEvent, hasSpecialEventCollection, isBeforeFirstAccess, next, nextOfType, peek, removeEventListener, resetCurrentEventCollection, setCurrentEventCollection
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
addEventListener, getLastNonCommentEvent, getParentInformation, getPreviousEvent, hasNextEvent, next, nextOfType, peek, removeEventListener
public PhyloXMLEventReader(java.io.File file, ReadWriteParameterMap parameters) throws java.io.IOException, javax.xml.stream.XMLStreamException
Throws: java.io.IOException, javax.xml.stream.XMLStreamException
public PhyloXMLEventReader(java.io.InputStream stream, ReadWriteParameterMap parameters) throws java.io.IOException, javax.xml.stream.XMLStreamException
Throws: java.io.IOException, javax.xml.stream.XMLStreamException
public PhyloXMLEventReader(java.io.Reader reader, ReadWriteParameterMap parameters) throws java.io.IOException, javax.xml.stream.XMLStreamException
Throws: java.io.IOException, javax.xml.stream.XMLStreamException
public PhyloXMLEventReader(javax.xml.stream.XMLEventReader xmlReader, ReadWriteParameterMap parameters)
public java.lang.String getFormatID()
Description copied from interface: JPhyloIOFormatSpecificObject
Returns a string ID uniquely identifying the target format of this instance. See JPhyloIOReaderWriterFactory.getFormatInfo(String).
Third party developers that create readers or writers for additional formats must make sure to use a globally unique format ID. It is strongly recommended to use owned reverse domain names for this (e.g. org.example.additionalformat).
Specified by: getFormatID in interface JPhyloIOFormatSpecificObject
See Also: JPhyloIOReaderWriterFactory.getFormatInfo(String)
protected void fillMap()
fillMap in class AbstractXMLEventReader<PhyloXMLReaderStreamDataProvider>
protected PhyloXMLReaderStreamDataProvider createStreamDataProvider()
Description copied from class: AbstractEventReader
This method is called in the constructor of AbstractEventReader to initialize the stream data provider that will be returned by AbstractEventReader.getStreamDataProvider(). Inheriting classes that use their own stream data provider implementation should override this method.
This default implementation creates a new instance of ReaderStreamDataProvider.
createStreamDataProvider in class AbstractXMLEventReader<PhyloXMLReaderStreamDataProvider>
public static javax.xml.namespace.QName readDatatypeAttributeValue(java.lang.String datatype, javax.xml.stream.events.StartElement element) throws JPhyloIOReaderException
Throws: JPhyloIOReaderException