java.lang.Object
com.tectonica.jonix.JonixRecords
- All Implemented Interfaces:
Iterable<JonixRecord>
This class provides the mechanism to scan one or more ONIX sources and process the ONIX records they contain (typically an ONIX Header followed by one or more ONIX Product records).
The normal preparation steps of this class are as follows:
- Add one or more ONIX sources
- Set the expected encoding of the sources (default is UTF-8)
- Optionally, set event handlers to be fired during processing
- Optionally, set key-value pairs, which will be accessible conveniently during processing
Example:
JonixRecords records = Jonix
    .source(new File("/path/to/folder-with-onix-files"), "*.xml", false)
    .source(new File("/path/to/file-with-short-style-onix-2.xml"))
    .source(new File("/path/to/file-with-reference-style-onix-3.onx"))
    .onSourceStart(src -> {
        // take a look at:
        // src.onixVersion()
        // src.header()
        // src.sourceName()
    })
    .onSourceEnd(src -> {
        // take a look at:
        // src.productsProcessedCount()
    })
    .failOnInvalidFile(false);
Once the JonixRecords is prepared, processing can be done in several ways:
Iteration
First and foremost, JonixRecords is an Iterable of JonixRecord. Hence, it can be iterated over with a simple for loop.
The following loop iterates over the ONIX Products in all sources, and handles them whether they're of version Onix2 or Onix3.
for (JonixRecord record : records) {
    if (record.product instanceof com.tectonica.jonix.onix3.Product) {
        com.tectonica.jonix.onix3.Product product3 = (com.tectonica.jonix.onix3.Product) record.product;
        // TODO: process the Onix3 <Product>
    } else if (record.product instanceof com.tectonica.jonix.onix2.Product) {
        com.tectonica.jonix.onix2.Product product2 = (com.tectonica.jonix.onix2.Product) record.product;
        // TODO: process the Onix2 <Product>
    } else {
        throw new IllegalArgumentException();
    }
}
To continue this example of low-level handling (staying very close to the structure of the XML data), the following is an elaborate version of the code above, pulling out the ISBN and first contributor from all ONIX Products:
for (JonixRecord record : records) {
    String isbn13;
    String personName = null;
    List<ContributorRoles> roles = null;
    if (record.product instanceof com.tectonica.jonix.onix2.Product) {
        com.tectonica.jonix.onix2.Product product2 = (com.tectonica.jonix.onix2.Product) record.product;
        isbn13 = product2.productIdentifiers()
            .find(ProductIdentifierTypes.ISBN_13)
            .map(pid -> pid.idValue().value)
            .orElse(null);
        List<com.tectonica.jonix.onix2.Contributor> contributors = product2.contributors();
        if (!contributors.isEmpty()) {
            com.tectonica.jonix.onix2.Contributor firstContributor = contributors.get(0);
            roles = firstContributor.contributorRoles().values();
            personName = firstContributor.personName().value;
        }
    } else if (record.product instanceof com.tectonica.jonix.onix3.Product) {
        com.tectonica.jonix.onix3.Product product3 = (com.tectonica.jonix.onix3.Product) record.product;
        isbn13 = product3.productIdentifiers()
            .find(ProductIdentifierTypes.ISBN_13)
            .map(pid -> pid.idValue().value)
            .orElse(null);
        List<com.tectonica.jonix.onix3.Contributor> contributors = product3.descriptiveDetail().contributors();
        if (!contributors.isEmpty()) {
            com.tectonica.jonix.onix3.Contributor firstContributor = contributors.get(0);
            roles = firstContributor.contributorRoles().values();
            personName = firstContributor.personName().value;
        }
    } else {
        throw new IllegalArgumentException();
    }
    System.out.println(String.format("Found ISBN %s, first person is %s, his roles: %s", isbn13, personName, roles));
}
Streaming
It is sometimes useful to invoke stream() and use the resulting Stream along with Java 8 Streaming APIs to achieve greater readability. The following example retrieves the Onix3 Products from their sources and stores them in an in-memory List:
import com.tectonica.jonix.onix3.Product;
...
List<Product> products3 = records.stream()
    .filter(rec -> rec.product instanceof Product)
    .map(rec -> (Product) rec.product)
    .collect(Collectors.toList());
Streaming as Unified Record
One of Jonix's best facilities is the Unification framework, which simplifies the treatment of varied sources (Onix2 mixed with Onix3 files) and eliminates some of the intricacies of XML handling.
The method streamUnified() returns a Stream, but not of the low-level JonixRecord objects. Instead, it streams out BaseRecord objects, which contain a typed and unified representation of the most essential data within a typical ONIX source. The following example shows how simple it is to extract data from an ONIX source without the inherent complications of ONIX diversity:
Set<PriceTypes> requestedPrices = JonixUtil.setOf(PriceTypes.RRP_including_tax, PriceTypes.RRP_excluding_tax);
records.streamUnified()
    .map(rec -> rec.product)
    .forEach(product -> {
        String recordReference = product.info.recordReference;
        String isbn13 = product.info.findProductId(ProductIdentifierTypes.ISBN_13);
        String title = product.titles.findTitleText(TitleTypes.Distinctive_title_book);
        List<String> authors = product.contributors.getDisplayNames(ContributorRoles.By_author);
        List<BasePrice> prices = product.supplyDetails.findPrices(requestedPrices);
        List<String> priceLabels = prices.stream()
            .map(bp -> bp.priceAmountAsStr + " " + bp.currencyCode).collect(Collectors.toList());
        System.out.println(String.format("Found product ref. %s, ISBN='%s', Title='%s', authors=%s, prices=%s",
            recordReference, isbn13, title, authors, priceLabels));
    });
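The idea behind unification, as described above, is that the version check is done once and client code only sees a single, version-independent shape. The following is a conceptual sketch of that idea only, not Jonix code: every type in it (Onix2Like, Onix3Like, Unified) and the unify method are hypothetical stand-ins invented for illustration, and the ISBN values are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Conceptual sketch of "unification"; every type here is a hypothetical stand-in, not a Jonix class.
class UnificationSketch {
    // Two version-specific shapes that expose the same datum differently.
    static class Onix2Like {
        final String isbn13;
        Onix2Like(String isbn13) { this.isbn13 = isbn13; }
    }

    static class Onix3Like {
        final List<String> identifiers; // assumption of this sketch: ISBN-13 is the first entry
        Onix3Like(List<String> identifiers) { this.identifiers = identifiers; }
    }

    // The single, version-independent shape handed to client code.
    static class Unified {
        final String isbn13;
        Unified(String isbn13) { this.isbn13 = isbn13; }
    }

    // The version check happens once, here, instead of in every consumer.
    static Unified unify(Object product) {
        if (product instanceof Onix2Like) {
            return new Unified(((Onix2Like) product).isbn13);
        } else if (product instanceof Onix3Like) {
            List<String> ids = ((Onix3Like) product).identifiers;
            return new Unified(ids.isEmpty() ? null : ids.get(0));
        }
        throw new IllegalArgumentException("unsupported product type");
    }

    public static void main(String[] args) {
        List<Object> mixed = Arrays.asList(
            new Onix2Like("9780000000001"),
            new Onix3Like(Arrays.asList("9780000000002")));
        List<String> isbns = mixed.stream()
            .map(UnificationSketch::unify)
            .map(u -> u.isbn13)
            .collect(Collectors.toList());
        System.out.println(isbns); // prints [9780000000001, 9780000000002]
    }
}
```

The consumer in main never branches on the version, which is the property the unified stream is described as providing.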
-
Method Summary
Modifier and Type, Method, Description
- failOnInvalidFile(boolean fail): This method sets the streaming policy when invalid sources are encountered (e.g. file not found).
- iterator()
- onSourceEnd(JonixRecords.OnSourceEvent onSourceEnd): Registers a listener for the SourceEnd event, which occurs after all records have been processed in the recently opened source.
- onSourceStart(JonixRecords.OnSourceEvent onSourceStart): Registers a listener for the SourceStart event, which occurs when a new source is about to be processed, but only after the ONIX version and the (optional) ONIX Header have been parsed.
- <T> T retrieve
- <T> T retrieve
- scanHeaders: This will "peek" into the Headers of the indicated ONIX sources, without processing the Products.
- <T> JonixRecords store: Stores an object for later use during the streaming process.
- stream()
- streamUnified(BaseFactory2 baseFactory2, BaseFactory3 baseFactory3)
- <P extends UnifiedProduct, H extends UnifiedHeader, R extends UnifiedRecord<P>> Stream<R> streamUnified(CustomUnifier<P, H, R> unifier)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Field Details
-
globalProductCount
-
failOnInvalidFile
protected boolean failOnInvalidFile
-
-
Method Details
-
source
-
source
-
source
- Throws:
IOException
-
failOnInvalidFile
This method sets the streaming policy when invalid sources are encountered (e.g. file not found). The default behavior is to stop streaming when such an error occurs.
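The two policies can be pictured with a small stand-alone sketch. It is not the Jonix implementation: the class InvalidFilePolicySketch and the method usableSources are hypothetical, and the assumption that an invalid source is silently skipped when the flag is false (and that an unchecked exception is thrown when it is true) belongs to this sketch only.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of a fail-fast vs. skip policy; NOT the Jonix implementation.
class InvalidFilePolicySketch {
    static List<File> usableSources(List<File> candidates, boolean failOnInvalidFile) {
        List<File> usable = new ArrayList<>();
        for (File f : candidates) {
            if (f.isFile()) {
                usable.add(f);
            } else if (failOnInvalidFile) {
                // Fail-fast policy: stop at the first invalid source.
                throw new UncheckedIOException(new FileNotFoundException(f.getPath()));
            }
            // Lenient policy (assumed here): skip the invalid source and carry on.
        }
        return usable;
    }

    public static void main(String[] args) {
        List<File> candidates = Arrays.asList(new File("no-such-onix-file-for-this-sketch.xml"));
        System.out.println("usable with lenient policy: " + usableSources(candidates, false));
        try {
            usableSources(candidates, true);
        } catch (UncheckedIOException e) {
            System.out.println("fail-fast policy stopped on: " + e.getCause().getMessage());
        }
    }
}
```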
-
store
Stores an object for later use during the streaming process. The stored object can be retrieved with retrieve(String).
-
retrieve
- Returns:
- an object stored with store(String, Object) during the streaming, or null if the key doesn't exist
-
retrieve
- Returns:
- an object stored with store(String, Object) during the streaming, or defaultValue if the key doesn't exist
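The store/retrieve contract described above (null for a missing key, or defaultValue in the second overload) can be illustrated with a small stand-in. This is a minimal sketch, not Jonix's implementation: the class KeyValueStoreSketch and its method bodies are hypothetical, the one-argument forms mirror the store(String, Object) and retrieve(String) references above, and the parameter order of the two-argument retrieve is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the key-value facility; NOT the Jonix implementation.
class KeyValueStoreSketch {
    private final Map<String, Object> values = new HashMap<>();

    // Returns 'this' so that calls can be chained, in the fluent style of the class examples.
    <T> KeyValueStoreSketch store(String key, T value) {
        values.put(key, value);
        return this;
    }

    // Returns the stored object, or null if the key doesn't exist.
    @SuppressWarnings("unchecked")
    <T> T retrieve(String key) {
        return (T) values.get(key);
    }

    // Returns the stored object, or defaultValue if the key doesn't exist.
    // The (key, defaultValue) parameter order is an assumption of this sketch.
    @SuppressWarnings("unchecked")
    <T> T retrieve(String key, T defaultValue) {
        return values.containsKey(key) ? (T) values.get(key) : defaultValue;
    }

    public static void main(String[] args) {
        KeyValueStoreSketch ctx = new KeyValueStoreSketch().store("batchId", 42);
        Integer batchId = ctx.retrieve("batchId");
        String missing = ctx.retrieve("noSuchKey");
        String fallback = ctx.retrieve("noSuchKey", "n/a");
        System.out.println(batchId + ", " + missing + ", " + fallback); // prints 42, null, n/a
    }
}
```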
-
getConfiguration
-
encoding
-
onSourceStart
Registers a listener for the SourceStart event, which occurs when a new source is about to be processed, but only after the ONIX version and the (optional) ONIX Header have been parsed. These will be available in the JonixSource of the JonixRecords.OnSourceEvent.
NOTE: this method can be called more than once to register several event listeners.
-
onSourceEnd
Registers a listener for the SourceEnd event, which occurs after all records have been processed in the recently opened source. In addition to all the information that was available for event listeners registered with onSourceStart(OnSourceEvent), the JonixSource passed when this event is fired also includes JonixSource.productCount(), with the final count of ONIX Products processed from the source.
NOTE: this method can be called more than once to register several event listeners.
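The note about registering several listeners implies that each call adds a listener rather than replacing the previous one. A minimal sketch of that behavior follows; it is not the Jonix implementation. The class SourceEventsSketch, the use of Consumer<String> in place of JonixRecords.OnSourceEvent, and the firing of listeners in registration order are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in showing additive listener registration; NOT the Jonix implementation.
class SourceEventsSketch {
    private final List<Consumer<String>> sourceEndListeners = new ArrayList<>();

    // Each call adds a listener instead of overwriting the one registered before.
    SourceEventsSketch onSourceEnd(Consumer<String> listener) {
        sourceEndListeners.add(listener);
        return this;
    }

    // Fires all listeners; doing so in registration order is an assumption of this sketch.
    void fireSourceEnd(String sourceName) {
        for (Consumer<String> listener : sourceEndListeners) {
            listener.accept(sourceName);
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        SourceEventsSketch events = new SourceEventsSketch()
            .onSourceEnd(name -> log.add("audit:" + name))
            .onSourceEnd(name -> log.add("stats:" + name));
        events.fireSourceEnd("file-1.xml");
        System.out.println(log); // prints [audit:file-1.xml, stats:file-1.xml]
    }
}
```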
-
stream
- Returns:
- a Stream of records, each containing a new Product object and a reference to the source from which it was taken
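Since JonixRecords is an Iterable, the relationship between iterating and streaming can be pictured with the JDK's StreamSupport, which turns any Iterable into a sequential Stream. The sketch below shows one possible bridge and is an assumption, not a description of how Jonix implements stream(); the class IterableStreamSketch and the helper streamOf are hypothetical names, and plain objects stand in for records.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper; shows a standard JDK way to view an Iterable as a Stream.
class IterableStreamSketch {
    static <T> Stream<T> streamOf(Iterable<T> iterable) {
        // 'false' requests a sequential (non-parallel) stream.
        return StreamSupport.stream(iterable.spliterator(), false);
    }

    public static void main(String[] args) {
        // Mixed element types stand in for records holding products of different versions.
        Iterable<Object> mixed = Arrays.asList("text-a", 3, "text-b", 4);
        List<Integer> numbers = streamOf(mixed)
            .filter(o -> o instanceof Integer)
            .map(o -> (Integer) o)
            .collect(Collectors.toList());
        System.out.println(numbers); // prints [3, 4]
    }
}
```

The filter, map, and collect steps follow the same pattern as the Onix3 example in the class description, applied to stand-in data.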
-
streamUnified
- Returns:
- a Stream of records, each containing a new BaseProduct object and a reference to the source from which it was taken
-
streamUnified
- Returns:
- a Stream of records, each containing a new BaseProduct object (which was created using the given factories) and a reference to the source from which it was taken
-
streamUnified
public <P extends UnifiedProduct, H extends UnifiedHeader, R extends UnifiedRecord<P>> Stream<R> streamUnified(CustomUnifier<P, H, R> unifier)
- Returns:
- a Stream of records, each containing a new custom Product object (which was created using the given CustomUnifier) and a reference to the source from which it was taken
-
scanHeaders
This will "peek" into the Headers of the indicated ONIX sources, without processing the Products. The onSourceStart() events will be fired as a result, allowing the header information to be handled.
-
iterator
- Specified by:
iterator in interface Iterable<JonixRecord>
-