jTessBoxEditor

Vietnamese Optical Character Recognition

Moderator: quân

jTessBoxEditor

Post by quân » Sun Apr 10, 2011 2:53 pm

jTessBoxEditor is a box editor for Tesseract OCR training data. It supports editing box data in both the Tesseract 2.0x and 3.0x formats and can read images in common formats, including multi-page TIFF. The program requires Java Runtime Environment 6.0 or later.

http://vietocr.sourceforge.net/training.html
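
For readers new to box files, here is a minimal sketch of reading a single box line. It assumes the commonly documented layout `<char> <left> <bottom> <right> <top> <page>` for the 3.0x format, with the page column absent in 2.0x, and coordinates measured from the bottom-left corner of the image. The `Box` type and `parse_box_line` function are hypothetical names for illustration only; they are not part of jTessBoxEditor.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    char: str
    left: int
    bottom: int
    right: int
    top: int
    page: Optional[int]  # None when the line has no page column (2.0x style)


def parse_box_line(line: str) -> Box:
    """Parse one box-file line into a Box (illustrative, not the tool's code)."""
    parts = line.split()
    if len(parts) == 6:
        char = parts[0]
        left, bottom, right, top, page = (int(p) for p in parts[1:])
        return Box(char, left, bottom, right, top, page)
    if len(parts) == 5:
        char = parts[0]
        left, bottom, right, top = (int(p) for p in parts[1:])
        return Box(char, left, bottom, right, top, None)
    raise ValueError(f"unexpected number of fields: {line!r}")
```

A line such as `A 10 20 30 40 0` would parse to a box for the character `A` on page 0.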
quân
 
Posts: 236
Joined: Sat Nov 16, 2002 1:51 am
Location: Oxnard, CA - USA

Re: jTessBoxEditor

Post by quân » Wed Oct 05, 2011 2:45 am

Version 0.6 adds a utility function that creates a TIFF/Box pair suitable for training with Tesseract. The generated box file is nearly identical to the one produced by Tesseract itself, so the two can be used to validate each other. You can use a Unicode-compatible file comparison tool, such as WinMerge, to compare the box files.
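
If you would rather script the comparison than eyeball it in a diff tool, something along these lines may help. This is only a sketch: `diff_box_lines` and its `tolerance` parameter are hypothetical, and it assumes each line is a character followed by integer fields, where the first four are coordinates. Because the two files are described as "nearly identical", coordinates are allowed to differ by a few pixels.

```python
from itertools import zip_longest


def _fields(line):
    """Split a box line into its character and its integer fields."""
    parts = line.split()
    return parts[0], [int(p) for p in parts[1:]]


def diff_box_lines(ours, theirs, tolerance=2):
    """Return (line_number, our_line, their_line) for lines that disagree.

    Coordinates may differ by up to `tolerance` pixels; the character and
    any remaining fields (such as the page number) must match exactly.
    """
    diffs = []
    for n, (a, b) in enumerate(zip_longest(ours, theirs), start=1):
        if a is None or b is None:  # one file has extra lines
            diffs.append((n, a, b))
            continue
        char_a, nums_a = _fields(a)
        char_b, nums_b = _fields(b)
        same = (
            char_a == char_b
            and len(nums_a) == len(nums_b)
            and all(abs(x - y) <= tolerance
                    for x, y in zip(nums_a[:4], nums_b[:4]))
            and nums_a[4:] == nums_b[4:]
        )
        if not same:
            diffs.append((n, a, b))
    return diffs
```

Read both files as UTF-8 and pass the line lists in; an empty result means the two box files agree within the tolerance.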

Re: jTessBoxEditor

Post by quân » Sun Jun 17, 2012 1:53 pm

jTessBoxEditor v0.7 has been released with the following enhancements:

- Fix an issue with opening the Help file on OS X
- For TIFF/Box generation:
    - increase line spacing
    - abbreviate bold/italic font style to b/i for filename
    - add a Prefix (Language Code) textbox
    - add support for text anti-aliasing
Also, the PowerShell script train.ps1 has been updated to automate training with Tesseract 3.02 on the Windows platform.
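
To illustrate how the prefix and the abbreviated font style could fit together in a filename, here is a rough sketch. It assumes the `[lang].[fontname].exp[num]` naming convention used for Tesseract 3.0x training files; the function name and the choice to lowercase the font name and strip its spaces are assumptions made for the example, not a description of what jTessBoxEditor actually does.

```python
def training_basename(prefix, font_name, bold=False, italic=False, exp=0):
    """Build a base name such as 'vie.timesnewromanb.exp0' (illustrative)."""
    # Abbreviate the style to b/i, as described in the release notes.
    style = ("b" if bold else "") + ("i" if italic else "")
    # Assumption: the font name is lowercased and its spaces are removed.
    font = font_name.replace(" ", "").lower() + style
    return f"{prefix}.{font}.exp{exp}"
```

The TIFF and box files would then share this base name and differ only in extension (`.tif` and `.box`).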

Re: jTessBoxEditor

Post by quân » Thu Apr 18, 2013 3:42 am

jTessBoxEditor v0.8 has been released with the following enhancements:

- Add row number header
- Char cell now editable
- Convert Unicode escape sequences where possible
- Find box now displays Unicode characters and allows search using Unicode escape sequences
- Improve Generate TIFF/Box functionality:
    * automatically combine boxes that have the same coordinates or that completely enclose one another
    * automatically combine the boxes of combining symbols, specified in an external file, with the box of their base character
    * retain last-modified exp number in filename
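
The box-combining rule can be sketched roughly as follows. This is an assumption-laden illustration, not the tool's actual algorithm: boxes are plain `(char, left, bottom, right, top)` tuples, only neighbouring boxes are compared, and `encloses` and `merge_enclosed` are hypothetical names. Identical coordinates are treated as a special case of one box enclosing the other.

```python
def encloses(outer, inner):
    """True if `outer` fully contains `inner` (equal coordinates count too)."""
    return (outer[1] <= inner[1] and outer[2] <= inner[2]
            and outer[3] >= inner[3] and outer[4] >= inner[4])


def merge_enclosed(boxes):
    """Merge each box into the previous one when either encloses the other."""
    merged = []
    for box in boxes:
        if merged:
            prev = merged[-1]
            if encloses(prev, box) or encloses(box, prev):
                # Concatenate the characters and keep the larger extent.
                merged[-1] = (
                    prev[0] + box[0],
                    min(prev[1], box[1]), min(prev[2], box[2]),
                    max(prev[3], box[3]), max(prev[4], box[4]),
                )
                continue
        merged.append(box)
    return merged
```

For example, a base letter followed by a combining accent whose box lies inside the letter's box would end up as a single box holding both characters.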

http://vietocr.sourceforge.net/training.html
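
As a side note on the Unicode escape handling mentioned in the list above, converting `\uXXXX` sequences to characters can be done along these lines. This is a minimal sketch with a hypothetical function name; it only handles four-digit escapes and does not try to join surrogate pairs.

```python
import re

# Matches a literal backslash, 'u', and exactly four hexadecimal digits.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def unescape_unicode(text):
    """Replace \\uXXXX escape sequences with the characters they denote."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
```

Text that contains no escape sequences passes through unchanged, which matches the "where possible" wording of the release note.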

Re: jTessBoxEditor

Post by quân » Fri May 03, 2013 12:19 am

jTessBoxEditor v0.9 Release:

- Enhance Generate TIFF/Box functionality to allow combining prepended symbols in addition to appended ones
- Fix a bug where changes made to the table in edit mode were not persisted
- Find function now supports partial matches
- Fix a problem where the table did not scroll along when the row header had focus and was scrolled

Re: jTessBoxEditor

Post by quân » Tue Nov 19, 2013 4:38 am

jTessBoxEditor v1.0 Release:

jTessBoxEditor is a box editor and trainer for Tesseract OCR. It supports editing box data in both the Tesseract 2.0x and 3.0x formats and provides full automation of Tesseract training. It can read images in common formats, including multi-page TIFF. The program requires Java Runtime Environment 6.0 or later.

This release includes the following improvements:

- Integrate support for full automation of Tesseract training
- Bundle Tesseract Windows training executables (r866), English data, and config files
- Fix an issue with generated TIFF missing metadata
- Optionally add noise to generated image
- Bug fixes and improvements
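
For anyone curious what adding noise to a generated training image might look like, here is a small sketch using salt-and-pepper noise on a grayscale pixel grid. The kind of noise jTessBoxEditor actually applies is not stated in this thread, so the technique, the function name, and the `amount` and `seed` parameters are all assumptions for illustration.

```python
import random


def add_salt_and_pepper(pixels, amount=0.02, seed=None):
    """Return a copy of a 2D grayscale grid with random black/white specks.

    `pixels` is a list of rows of 0-255 values; `amount` is the probability
    that any single pixel is replaced. The input grid is left untouched.
    """
    rng = random.Random(seed)
    noisy = [row[:] for row in pixels]
    for row in noisy:
        for x in range(len(row)):
            if rng.random() < amount:
                row[x] = rng.choice((0, 255))
    return noisy
```

A small amount of noise like this is sometimes used so that synthetic training images look a little closer to scanned pages.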

http://vietocr.sourceforge.net/training.html
http://sf.net/projects/vietocr/files/jTessBoxEditor/
