Some trains have double train numbers, e.g. ICE 234, ICE 136. In the PDF I have, the second number is on a new line (together with the arrival station), so I check that line and append the result to the number if found. This broke things for international tickets, so it is only done for domestic ones.
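A rough sketch of the approach described above, in Python for illustration only; it is not the code from this diff. The function name `merge_train_number`, the `is_domestic` flag, and the `TRAIN_NUMBER` pattern are all hypothetical, and the sketch assumes the second number sits at the start of the following line.

```python
import re

# Assumed shape of a train number: capital-letter type plus digits, e.g. "ICE 136".
# This pattern is an illustration, not a rule taken from real tickets.
TRAIN_NUMBER = re.compile(r"^\s*([A-Z]{2,4} \d{1,5})\b")


def merge_train_number(number, next_line, is_domestic):
    """Append a second train number found at the start of next_line, if any."""
    if not is_domestic:
        # International tickets use a different layout, so leave them untouched.
        return number
    match = TRAIN_NUMBER.match(next_line)
    if match is None:
        # Nothing that looks like a train number on the next line.
        return number
    return f"{number}, {match.group(1)}"
```

For example, `merge_train_number("ICE 234", "ICE 136 Hamburg Hbf", True)` yields `"ICE 234, ICE 136"`, while the same call with `is_domestic=False` returns the number unchanged.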
Details
- Reviewers
vkrause
Tested with one domestic ticket (with a double train number) and two international ones (without double train numbers). Needs more thorough testing with more PDFs.
Diff Detail
- Repository
- R1003 KItinerary: Travel Reservation handling library
- Branch
- multitrainnumber
- Lint
- No Linters Available
- Unit
- No Unit Test Coverage
- Build Status
- Buildable 2660 Build 2678: arc lint + arc unit
Thanks for looking into this!
This seems to break the unit tests (unstructureddataextractortest) unfortunately, as well as the tests on my ticket collection (with similar symptoms).
I wonder if we can assume that the second train number has the same type, i.e. it will always be "ICE 123, ICE 234" and not "ICE 123, FOO 234", or make any other assumption about what a valid train number looks like. Then we could match for that in the second line and avoid picking up text that isn't a train number.
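To make the idea concrete, here is a minimal Python sketch of the same-type match. The helper name `second_train_number` is hypothetical, and the whole thing rests on the assumption, still unverified, that both numbers share one train type and are written as "TYPE digits".

```python
import re


def second_train_number(first_number, next_line):
    """Return a train number on next_line with the same type as first_number, else None."""
    parts = first_number.split()
    if len(parts) != 2:
        # Unexpected shape for the first number; do not guess.
        return None
    train_type = parts[0]
    # Only accept "<same type> <digits>", e.g. "ICE 136" after "ICE 234".
    pattern = rf"\b{re.escape(train_type)} (\d{{1,5}})\b"
    match = re.search(pattern, next_line)
    if match is None:
        return None
    return f"{train_type} {match.group(1)}"
```

With this, `second_train_number("ICE 234", "ICE 136 Hamburg Hbf")` gives `"ICE 136"`, while a line such as `"FOO 136 Hamburg Hbf"` or `"Gleis 7"` gives `None`, so unrelated text on the second line is not appended.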
Good question. For the "Flügelzug" configuration in your test case it's always the same type I think. However, I'm not sure if the train-equivalent of "code shares" exists, e.g. on international ICE/TGV/Thalys/etc services.
The lesson I have learned so far from trying to find patterns for train numbers (or platforms/station names) is: don't, there's always some corner case breaking this ;-)