public class PhyloXMLEventReader extends AbstractXMLEventReader<PhyloXMLReaderStreamDataProvider> implements PhyloXMLConstants
Since trees are represented by a hierarchical structure of clade tags in PhyloXML, they need
to be serialized to an event sequence according to the JPhyloIO grammar while reading. To achieve this,
node and edge information is buffered until a clade end tag is reached. NodeEvents and
EdgeEvents representing all edges leading to children of this node are then fired, including all nested
meta-events. If custom XML is encountered before a clade end tag, events are fired at that point to avoid buffering large
amounts of custom data. Since the predefined elements do not contain large amounts of data, buffering
this information does not make reading significantly less efficient. Performance problems may occur if large
molecular sequences are attached to the phylogeny.
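The buffering strategy described above can be illustrated with a minimal sketch that uses only the StAX API of the JDK. It is not the implementation of this reader: the class name `CladeBufferingSketch`, the generated IDs (`n0`, `n1`, ...) and the string representation of events are assumptions made for illustration. Information about each open clade is kept on a stack, and a node entry followed by the edges to its children is emitted once the clade end tag is reached.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

/** Hypothetical sketch, not part of JPhyloIO. */
public class CladeBufferingSketch {
    /** Buffered information about one clade whose end tag has not been reached yet. */
    private static final class OpenClade {
        final String nodeID;
        final List<String> childNodeIDs = new ArrayList<>();

        OpenClade(String nodeID) {
            this.nodeID = nodeID;
        }
    }

    /** Returns a flat list of "NODE ..." and "EDGE ..." strings in the order they would be fired. */
    public static List<String> serialize(String xml) {
        List<String> events = new ArrayList<>();
        Deque<OpenClade> open = new ArrayDeque<>();  // Stack of clades that are still open
        int nextID = 0;
        try {
            XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement() && "clade".equals(event.asStartElement().getName().getLocalPart())) {
                    open.push(new OpenClade("n" + nextID++));  // Start buffering this clade
                }
                else if (event.isEndElement() && "clade".equals(event.asEndElement().getName().getLocalPart())) {
                    OpenClade closed = open.pop();
                    events.add("NODE " + closed.nodeID);
                    for (String child : closed.childNodeIDs) {  // Edges leading to the children of this node
                        events.add("EDGE " + closed.nodeID + " -> " + child);
                    }
                    if (!open.isEmpty()) {
                        open.peek().childNodeIDs.add(closed.nodeID);  // Remember this node for the parent's edges
                    }
                }
            }
            reader.close();
        }
        catch (XMLStreamException e) {
            throw new IllegalStateException("The XML input could not be parsed.", e);
        }
        return events;
    }

    public static void main(String[] args) {
        String xml = "<phylogeny><clade><clade/><clade><clade/></clade></clade></phylogeny>";
        serialize(xml).forEach(System.out::println);
    }
}
```

For the input in `main`, the sketch yields `NODE n1`, `NODE n3`, `NODE n2`, `EDGE n2 -> n3`, `NODE n0`, `EDGE n0 -> n1`, `EDGE n0 -> n2`: children always precede the node they belong to, which is why nothing can be emitted before the clade end tag.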
Custom XML is read at all positions where no other element reader is registered. This includes custom elements nested under elements where this is illegal. Registering custom element readers only under tags where this is valid would require buffering the whole custom content.
Predefined data elements are represented as LiteralMetadataEvent or ResourceMetadataEvent with
specific internally used predicates. ResourceMetadataEvents may be used to group events representing attribute
values and element contents. Property tags are represented by LiteralMetadataEvent with the
value of the ref attribute as a predicate. If other attributes are present or the value of the
applies_to attribute indicates a different position than the element is actually found in, these
attribute values and the content are grouped by a ResourceMetadataEvent. The content of a property
tag is translated to a Java object using classes of the type ObjectTranslator.
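As an illustration of how the value of a ref attribute could become a predicate, the following sketch resolves a prefixed name such as `ex:depth` into a `javax.xml.namespace.QName` using the namespace context of the surrounding start element. It relies only on the StAX API of the JDK. The class name `PropertyPredicateSketch`, its method names and the example namespace are assumptions and are not taken from JPhyloIO.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

/** Hypothetical sketch, not part of JPhyloIO. */
public class PropertyPredicateSketch {
    /** Resolves a value of the form "prefix:localPart" using the namespaces in scope at the element. */
    public static QName resolve(String prefixedName, StartElement element) {
        int colon = prefixedName.indexOf(':');
        if (colon < 0) {
            return new QName(prefixedName);  // No prefix, so no namespace is assigned
        }
        String prefix = prefixedName.substring(0, colon);
        String uri = element.getNamespaceContext().getNamespaceURI(prefix);
        if (uri == null || uri.isEmpty()) {
            throw new IllegalArgumentException("The prefix \"" + prefix + "\" is not bound to a namespace.");
        }
        return new QName(uri, prefixedName.substring(colon + 1), prefix);
    }

    /** Returns the predicate derived from the ref attribute of the first property tag or null. */
    public static QName firstPropertyPredicate(String xml) {
        try {
            XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    StartElement element = event.asStartElement();
                    if ("property".equals(element.getName().getLocalPart())) {
                        Attribute ref = element.getAttributeByName(new QName("ref"));
                        if (ref != null) {
                            return resolve(ref.getValue(), element);
                        }
                    }
                }
            }
            reader.close();
        }
        catch (XMLStreamException e) {
            throw new IllegalStateException("The XML input could not be parsed.", e);
        }
        return null;
    }

    public static void main(String[] args) {
        String xml = "<clade xmlns:ex=\"http://example.org/terms\">"
                + "<property ref=\"ex:depth\" datatype=\"xsd:integer\" applies_to=\"clade\">1200</property></clade>";
        System.out.println(firstPropertyPredicate(xml));
    }
}
```

A consumer of the event stream would then compare the predicate of a received metadata event with such a `QName` to identify the property it is interested in.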
Phylogenies in PhyloXML files can either be interpreted as phylogenetic trees or rooted networks, depending on
the value of the parameter ReadWriteParameterNames.KEY_PHYLOXML_CONSIDER_PHYLOGENY_AS_TREE.
If it is interpreted as a network, edges defined by clade_rel tags are represented by edge events with a
nested meta event with the predicate ReadWriteConstants.PREDICATE_IS_CROSSLINK, otherwise they are
represented by meta events. By default, this reader considers phylogenies as networks. If trees shall be read instead,
this parameter must be provided and set to true.
Element IDs found in a PhyloXML document (specified as the value of the id_source-attribute) are not the same
as the event IDs generated by JPhyloIO. If an element ID is encountered it is represented by a LiteralMetadataEvent
with the predicate ReadWriteConstants#PREDICATE_ATTR_ID_SOURCE. If a reference to an element ID is encountered
(specified as the value of an id_ref-attribute), it is also represented by a LiteralMetadataEvent with a
corresponding predicate (e.g. ReadWriteConstants#PREDICATE_SEQUENCE_ATTR_ID_REF).
The mapping of id source values to event IDs can be obtained from the parameter map under the key
ReadWriteParameterNames.KEY_PHYLOXML_EVENT_ID_TRANSLATION_MAP.
In case of the clade_rel tag the JPhyloIO event IDs are given in additional LiteralMetadataEvents
with the predicates ReadWriteConstants.PREDICATE_EDGE_SOURCE_NODE and ReadWriteConstants.PREDICATE_EDGE_TARGET_NODE.
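The relation between element IDs from the document and generated event IDs can be pictured with the following self-contained sketch. It does not use JPhyloIO. The class `IdTranslationSketch`, the ID scheme (`n0`, `n1`, ...) and the decision to treat the first reference of a clade_rel tag as the source node and the second as the target node are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch, not part of JPhyloIO. */
public class IdTranslationSketch {
    private final Map<String, String> idSourceToEventID = new LinkedHashMap<>();
    private int counter = 0;

    /**
     * Generates an event ID for a clade and remembers the translation if the clade
     * carries an id_source value (may be null if the attribute is absent).
     */
    public String registerClade(String idSource) {
        String eventID = "n" + counter++;
        if (idSource != null) {
            idSourceToEventID.put(idSource, eventID);
        }
        return eventID;
    }

    /**
     * Translates the two references of a clade_rel tag into event IDs.
     * Assumption: the first reference is the source and the second the target node.
     */
    public String[] resolveCladeRelation(String idRef0, String idRef1) {
        String source = idSourceToEventID.get(idRef0);
        String target = idSourceToEventID.get(idRef1);
        if (source == null || target == null) {
            throw new IllegalArgumentException("A clade_rel tag references an unknown element ID.");
        }
        return new String[] {source, target};
    }

    public static void main(String[] args) {
        IdTranslationSketch translation = new IdTranslationSketch();
        translation.registerClade("cladeA");
        translation.registerClade(null);  // Clade without id_source
        translation.registerClade("cladeB");
        String[] edge = translation.resolveCladeRelation("cladeA", "cladeB");
        System.out.println(edge[0] + " -> " + edge[1]);
    }
}
```

The point of the sketch is that document IDs and event IDs live in separate ID spaces, so an application that wants to follow an id_ref value needs a translation map of this kind.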
See Also:
PhyloXMLConstants, Metadata demo application
Fields inherited from class AbstractXMLEventReader:
INTERNAL_USE_NAMESPACE, TAG_PARENT_OF_ROOT
Fields inherited from interface PhyloXMLConstants:
APPLIES_TO_ANNOTATION, APPLIES_TO_CLADE, APPLIES_TO_NODE, APPLIES_TO_OTHER, APPLIES_TO_PARENT_BRANCH, APPLIES_TO_PHYLOGENY, ATTR_ABSENT_COUNT, ATTR_ALT_UNIT, ATTR_APPLIES_TO, ATTR_BRANCH_LENGTH, ATTR_BRANCH_LENGTH_UNIT, ATTR_COLLAPSE, ATTR_COMMENT, ATTR_CONFIDENCE, ATTR_DATATYPE, ATTR_DESC, ATTR_DISTANCE, ATTR_DOI, ATTR_EVIDENCE, ATTR_FROM, ATTR_GAINED_COUNT, ATTR_GEO_DATUM, ATTR_ID, ATTR_ID_PROVIDER, ATTR_ID_REF, ATTR_ID_REF_0, ATTR_ID_REF_1, ATTR_ID_SOURCE, ATTR_IS_ALIGNED, ATTR_LENGTH, ATTR_LOST_COUNT, ATTR_PRESENT_COUNT, ATTR_REF, ATTR_REROOTABLE, ATTR_ROOTED, ATTR_SOURCE, ATTR_STANDARD_DEVIATION, ATTR_TO, ATTR_TYPE, ATTR_UNIT, DATA_TYPE_BRANCH_COLOR, DATA_TYPE_EVENTTYPE, DATA_TYPE_RANK, DATA_TYPE_SEQUENCE_SYMBOL, DATA_TYPE_TAXONOMY_CODE, JPHYLOIO_PHYLOXML_NAMESPACE, PHYLOXML_DATA_TYPE_NAMESPACE, PHYLOXML_DEFAULT_PRE, PHYLOXML_FORMAT_NAME, PHYLOXML_NAMESPACE, PHYLOXML_PREDICATE_NAMESPACE, PHYLOXML_SCHEMA_LOCATION_URI, PREDICATE_ANNOTATION, PREDICATE_ANNOTATION_ATTR_EVIDENCE, PREDICATE_ANNOTATION_ATTR_REF, PREDICATE_ANNOTATION_ATTR_SOURCE, PREDICATE_ANNOTATION_ATTR_TYPE, PREDICATE_ANNOTATION_CONFIDENCE, PREDICATE_ANNOTATION_CONFIDENCE_ATTR_TYPE, PREDICATE_ANNOTATION_CONFIDENCE_VALUE, PREDICATE_ANNOTATION_DESC, PREDICATE_ANNOTATION_PROPERTY, PREDICATE_ANNOTATION_PROPERTY_ATTR_APPLIES_TO, PREDICATE_ANNOTATION_PROPERTY_ATTR_DATATYPE, PREDICATE_ANNOTATION_PROPERTY_ATTR_ID_REF, PREDICATE_ANNOTATION_PROPERTY_ATTR_UNIT, PREDICATE_ANNOTATION_URI, PREDICATE_ANNOTATION_URI_ATTR_DESC, PREDICATE_ANNOTATION_URI_ATTR_TYPE, PREDICATE_ANNOTATION_URI_VALUE, PREDICATE_ATTR_COLLAPSE, PREDICATE_ATTR_ID_SOURCE, PREDICATE_BINARY_CHARACTERS, PREDICATE_BINARY_CHARACTERS_ABSENT, PREDICATE_BINARY_CHARACTERS_ABSENT_BC, PREDICATE_BINARY_CHARACTERS_ATTR_ABSENT_COUNT, PREDICATE_BINARY_CHARACTERS_ATTR_GAINED_COUNT, PREDICATE_BINARY_CHARACTERS_ATTR_LOST_COUNT, PREDICATE_BINARY_CHARACTERS_ATTR_PRESENT_COUNT,
PREDICATE_BINARY_CHARACTERS_ATTR_TYPE, PREDICATE_BINARY_CHARACTERS_GAINED, PREDICATE_BINARY_CHARACTERS_GAINED_BC, PREDICATE_BINARY_CHARACTERS_LOST, PREDICATE_BINARY_CHARACTERS_LOST_BC, PREDICATE_BINARY_CHARACTERS_PRESENT, PREDICATE_BINARY_CHARACTERS_PRESENT_BC, PREDICATE_CLADE_REL, PREDICATE_CLADE_REL_ATTR_DISTANCE, PREDICATE_CLADE_REL_ATTR_IDREF0, PREDICATE_CLADE_REL_ATTR_IDREF1, PREDICATE_CLADE_REL_ATTR_TYPE, PREDICATE_COLOR, PREDICATE_COLOR_ALPHA, PREDICATE_COLOR_BLUE, PREDICATE_COLOR_GREEN, PREDICATE_COLOR_RED, PREDICATE_CONFIDENCE, PREDICATE_CONFIDENCE_ATTR_STDDEV, PREDICATE_CONFIDENCE_ATTR_TYPE, PREDICATE_CONFIDENCE_VALUE, PREDICATE_DATE, PREDICATE_DATE_ATTR_UNIT, PREDICATE_DATE_DESC, PREDICATE_DATE_MAXIMUM, PREDICATE_DATE_MINIMUM, PREDICATE_DATE_VALUE, PREDICATE_DISTRIBUTION, PREDICATE_DISTRIBUTION_DESC, PREDICATE_DISTRIBUTION_POINT, PREDICATE_DISTRIBUTION_POINT_ALT, PREDICATE_DISTRIBUTION_POINT_ATTR_ALT_UNIT, PREDICATE_DISTRIBUTION_POINT_ATTR_GEODETIC_DATUM, PREDICATE_DISTRIBUTION_POINT_LAT, PREDICATE_DISTRIBUTION_POINT_LONG, PREDICATE_DISTRIBUTION_POLYGON, PREDICATE_DISTRIBUTION_POLYGON_POINT, PREDICATE_DISTRIBUTION_POLYGON_POINT_ALT, PREDICATE_DISTRIBUTION_POLYGON_POINT_ATTR_ALT_UNIT, PREDICATE_DISTRIBUTION_POLYGON_POINT_ATTR_GEODETIC_DATUM, PREDICATE_DISTRIBUTION_POLYGON_POINT_LAT, PREDICATE_DISTRIBUTION_POLYGON_POINT_LONG, PREDICATE_DOMAIN_ARCHITECTURE, PREDICATE_DOMAIN_ARCHITECTURE_ATTR_LENGTH, PREDICATE_DOMAIN_ARCHITECTURE_DOMAIN, PREDICATE_DOMAIN_ARCHITECTURE_DOMAIN_ATTR_CONFIDENCE, PREDICATE_DOMAIN_ARCHITECTURE_DOMAIN_ATTR_FROM, PREDICATE_DOMAIN_ARCHITECTURE_DOMAIN_ATTR_ID, PREDICATE_DOMAIN_ARCHITECTURE_DOMAIN_ATTR_TO, PREDICATE_DOMAIN_ARCHITECTURE_DOMAIN_VALUE, PREDICATE_EVENTS, PREDICATE_EVENTS_CONFIDENCE, PREDICATE_EVENTS_CONFIDENCE_ATTR_TYPE, PREDICATE_EVENTS_CONFIDENCE_VALUE, PREDICATE_EVENTS_DUPLICATIONS, PREDICATE_EVENTS_LOSSES, PREDICATE_EVENTS_SPECIATIONS, PREDICATE_EVENTS_TYPE, PREDICATE_NODE_ID, PREDICATE_NODE_ID_ATTR_PROVIDER, 
PREDICATE_NODE_ID_VALUE, PREDICATE_PHYLOGENY_ATTR_BRANCH_LENGTH_UNIT, PREDICATE_PHYLOGENY_ATTR_REROOTABLE, PREDICATE_PHYLOGENY_ATTR_TYPE, PREDICATE_PHYLOGENY_DATE, PREDICATE_PHYLOGENY_DESCRIPTION, PREDICATE_PHYLOGENY_ID, PREDICATE_PHYLOGENY_ID_ATTR_PROVIDER, PREDICATE_PHYLOGENY_ID_VALUE, PREDICATE_PROPERTY, PREDICATE_PROPERTY_ATTR_APPLIES_TO, PREDICATE_PROPERTY_ATTR_ID_REF, PREDICATE_PROPERTY_ATTR_UNIT, PREDICATE_REFERENCE, PREDICATE_REFERENCE_ATTR_DOI, PREDICATE_REFERENCE_DESC, PREDICATE_REFERENCE_VALUE, PREDICATE_SEQ_REL, PREDICATE_SEQ_REL_ATTR_DISTANCE, PREDICATE_SEQ_REL_ATTR_IDREF0, PREDICATE_SEQ_REL_ATTR_IDREF1, PREDICATE_SEQ_REL_ATTR_TYPE, PREDICATE_SEQ_REL_CONFIDENCE, PREDICATE_SEQ_REL_CONFIDENCE_ATTR_TYPE, PREDICATE_SEQ_REL_CONFIDENCE_VALUE, PREDICATE_SEQUENCE, PREDICATE_SEQUENCE_ACCESSION, PREDICATE_SEQUENCE_ACCESSION_ATTR_COMMENT, PREDICATE_SEQUENCE_ACCESSION_ATTR_SOURCE, PREDICATE_SEQUENCE_ACCESSION_VALUE, PREDICATE_SEQUENCE_ATTR_ID_REF, PREDICATE_SEQUENCE_ATTR_TYPE, PREDICATE_SEQUENCE_CROSS_REFERENCES, PREDICATE_SEQUENCE_CROSS_REFERENCES_ACCESSION, PREDICATE_SEQUENCE_CROSS_REFERENCES_ACCESSION_ATTR_COMMENT, PREDICATE_SEQUENCE_CROSS_REFERENCES_ACCESSION_ATTR_SOURCE, PREDICATE_SEQUENCE_CROSS_REFERENCES_ACCESSION_VALUE, PREDICATE_SEQUENCE_GENE_NAME, PREDICATE_SEQUENCE_LOCATION, PREDICATE_SEQUENCE_MOL_SEQ, PREDICATE_SEQUENCE_MOL_SEQ_ATTR_IS_ALIGNED, PREDICATE_SEQUENCE_MOL_SEQ_VALUE, PREDICATE_SEQUENCE_NAME, PREDICATE_SEQUENCE_SYMBOL, PREDICATE_SEQUENCE_URI, PREDICATE_SEQUENCE_URI_ATTR_DESC, PREDICATE_SEQUENCE_URI_ATTR_TYPE, PREDICATE_SEQUENCE_URI_VALUE, PREDICATE_TAXONOMY, PREDICATE_TAXONOMY_AUTHORITY, PREDICATE_TAXONOMY_CODE, PREDICATE_TAXONOMY_COMMON_NAME, PREDICATE_TAXONOMY_ID, PREDICATE_TAXONOMY_ID_ATTR_PROVIDER, PREDICATE_TAXONOMY_ID_VALUE, PREDICATE_TAXONOMY_RANK, PREDICATE_TAXONOMY_SCIENTIFIC_NAME, PREDICATE_TAXONOMY_SYNONYM, PREDICATE_TAXONOMY_URI, PREDICATE_TAXONOMY_URI_ATTR_DESC, PREDICATE_TAXONOMY_URI_ATTR_TYPE, PREDICATE_TAXONOMY_URI_VALUE, 
PREDICATE_WIDTH, TAG_ABSENT, TAG_ACCESSION, TAG_ALPHA, TAG_ALT, TAG_ANNOTATION, TAG_AUTHORITY, TAG_BC, TAG_BINARY_CHARACTERS, TAG_BLUE, TAG_BRANCH_COLOR, TAG_BRANCH_LENGTH, TAG_BRANCH_WIDTH, TAG_CLADE, TAG_CLADE_RELATION, TAG_CODE, TAG_COMMON_NAME, TAG_CONFIDENCE, TAG_CROSS_REFERENCES, TAG_DATE, TAG_DESC, TAG_DESCRIPTION, TAG_DISTRIBUTION, TAG_DOMAIN, TAG_DOMAIN_ARCHITECTURE, TAG_DUPLICATIONS, TAG_EVENTS, TAG_GAINED, TAG_GENE_NAME, TAG_GREEN, TAG_ID, TAG_LAT, TAG_LOCATION, TAG_LONG, TAG_LOSSES, TAG_LOST, TAG_MAXIMUM, TAG_MINIMUM, TAG_MOL_SEQ, TAG_NAME, TAG_NODE_ID, TAG_PHYLOGENY, TAG_POINT, TAG_POLYGON, TAG_PRESENT, TAG_PROPERTY, TAG_RANK, TAG_RED, TAG_REFERENCE, TAG_ROOT, TAG_SCI_NAME, TAG_SEQUENCE, TAG_SEQUENCE_RELATION, TAG_SPECIATIONS, TAG_SYMBOL, TAG_SYNONYM, TAG_TAXONOMY, TAG_TYPE, TAG_URI, TAG_VALUE, TYPE_NETWORK_EDGE
Fields inherited from interface ReadWriteConstants:
ATTRIBUTE_STRING_KEY, ATTRIBUTES_NAMESPACE_FOLDER, DATA_TYPE_NAMESPACE_FOLDER, DATA_TYPE_SIMPLE_VALUE_LIST, DEFAULT_CHAR_SET_ID_PREFIX, DEFAULT_CHARACTER_DEFINITION_ID_PREFIX, DEFAULT_EDGE_ID_PREFIX, DEFAULT_GENERAL_ID_PREFIX, DEFAULT_MATRIX_ID_PREFIX, DEFAULT_MAX_COMMENT_LENGTH, DEFAULT_MAX_TOKENS_TO_READ, DEFAULT_META_ID_PREFIX, DEFAULT_NETWORK_ID_PREFIX, DEFAULT_NODE_EDGE_SET_ID_PREFIX, DEFAULT_NODE_ID_PREFIX, DEFAULT_OTU_ID_PREFIX, DEFAULT_OTU_LIST_ID_PREFIX, DEFAULT_OTU_SET_ID_PREFIX, DEFAULT_SEQUENCE_ID_PREFIX, DEFAULT_SEQUENCE_SET_ID_PREFIX, DEFAULT_TOKEN_DEFINITION_ID_PREFIX, DEFAULT_TOKEN_SET_ID_PREFIX, DEFAULT_TREE_ID_PREFIX, DEFAULT_TREE_NETWORK_GROUP_ID_PREFIX, DEFAULT_TREE_NETWORK_SET_ID_PREFIX, JPHYLOIO_ATTRIBUTES_NAMESPACE, JPHYLOIO_ATTRIBUTES_PREFIX, JPHYLOIO_DATA_TYPE_NAMESPACE, JPHYLOIO_DATA_TYPE_PREFIX, JPHYLOIO_FORMATS_NAMESPACE_PREFIX, JPHYLOIO_GENERAL_NAMESPACE, JPHYLOIO_NAMESPACE_PREFIX, JPHYLOIO_PREDICATE_NAMESPACE, JPHYLOIO_PREDICATE_PREFIX, PREDICATE_CHARACTER_COUNT, PREDICATE_EDGE_LENGTH, PREDICATE_EDGE_SOURCE_NODE, PREDICATE_EDGE_TARGET_NODE, PREDICATE_HAS_CUSTOM_XML, PREDICATE_HAS_LITERAL_METADATA,
PREDICATE_HAS_RESOURCE_METADATA, PREDICATE_IS_CROSSLINK, PREDICATE_NAMESPACE_FOLDER, PREDICATE_PART_SEPERATOR, PREDICATE_SEQUENCE_COUNT, RESERVED_ID_PREFIX
| Constructor and Description |
|---|
| PhyloXMLEventReader(java.io.File file, ReadWriteParameterMap parameters) |
| PhyloXMLEventReader(java.io.InputStream stream, ReadWriteParameterMap parameters) |
| PhyloXMLEventReader(java.io.Reader reader, ReadWriteParameterMap parameters) |
| PhyloXMLEventReader(javax.xml.stream.XMLEventReader xmlReader, ReadWriteParameterMap parameters) |
| Modifier and Type | Method and Description |
|---|---|
| protected PhyloXMLReaderStreamDataProvider | createStreamDataProvider() This method is called in the constructor of AbstractEventReader to initialize the stream data provider that will be returned by AbstractEventReader.getStreamDataProvider(). |
| protected void | fillMap() |
| java.lang.String | getFormatID() Returns a string ID uniquely identifying the target format of this instance. |
| static javax.xml.namespace.QName | readDatatypeAttributeValue(java.lang.String datatype, javax.xml.stream.events.StartElement element) |
Methods inherited from class AbstractXMLEventReader:
close, createMetaXMLEventReader, createMetaXMLStreamReader, getElementReader, getElementReaderMap, getEncounteredTags, getNamespaceContext, getXMLReader, isAllowDefaultNamespace, parseQName, putElementReader, readNextEvent
Methods inherited from class AbstractEventReader:
addEventListener, fireEvent, getCurrentEventCollection, getIDManager, getLastNonCommentEvent, getParameters, getParentInformation, getPreviousEvent, getSequenceTokensEventManager, getStreamDataProvider, getUpcomingEvents, hasNextEvent, hasSpecialEventCollection, isBeforeFirstAccess, next, nextOfType, peek, removeEventListener, resetCurrentEventCollection, setCurrentEventCollection
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface JPhyloIOEventReader:
addEventListener, getLastNonCommentEvent, getParentInformation, getPreviousEvent, hasNextEvent, next, nextOfType, peek, removeEventListener
public PhyloXMLEventReader(java.io.File file, ReadWriteParameterMap parameters) throws java.io.IOException, javax.xml.stream.XMLStreamException
Throws:
java.io.IOException
javax.xml.stream.XMLStreamException
public PhyloXMLEventReader(java.io.InputStream stream, ReadWriteParameterMap parameters) throws java.io.IOException, javax.xml.stream.XMLStreamException
Throws:
java.io.IOException
javax.xml.stream.XMLStreamException
public PhyloXMLEventReader(java.io.Reader reader, ReadWriteParameterMap parameters) throws java.io.IOException, javax.xml.stream.XMLStreamException
Throws:
java.io.IOException
javax.xml.stream.XMLStreamException
public PhyloXMLEventReader(javax.xml.stream.XMLEventReader xmlReader, ReadWriteParameterMap parameters)
public java.lang.String getFormatID()
Description copied from interface: JPhyloIOFormatSpecificObject
Returns a string ID uniquely identifying the target format of this instance (see JPhyloIOReaderWriterFactory.getFormatInfo(String)).
Third party developers that create readers or writers for additional formats must make sure to use a globally unique
format ID. It is strongly recommended to use owned reverse domain names for this (e.g.
org.example.additionalformat).
Specified by:
getFormatID in interface JPhyloIOFormatSpecificObject
See Also:
JPhyloIOReaderWriterFactory.getFormatInfo(String)
protected void fillMap()
fillMap in class AbstractXMLEventReader<PhyloXMLReaderStreamDataProvider>
protected PhyloXMLReaderStreamDataProvider createStreamDataProvider()
Description copied from class: AbstractEventReader
This method is called in the constructor of AbstractEventReader to initialize the stream
data provider that will be returned by AbstractEventReader.getStreamDataProvider(). Inheriting classes that use
their own stream data provider implementation should override this method.
This default implementation creates a new instance of ReaderStreamDataProvider.
Overrides:
createStreamDataProvider in class AbstractXMLEventReader<PhyloXMLReaderStreamDataProvider>
public static javax.xml.namespace.QName readDatatypeAttributeValue(java.lang.String datatype, javax.xml.stream.events.StartElement element) throws JPhyloIOReaderException
Throws:
JPhyloIOReaderException