Tuesday, November 8, 2011

Sigil and Data Loss Bugs

The majority of the data loss issues have been mitigated at this point. With a work flow of open, save as after major changes and saving after minor ones, catastrophic data loss can be worked around to the point that Sigil can and is being used on a day to day basis.

That said, there are issues with data loss in Sigil and they are a priority. I'm currently finishing up the 0.5 release (I do not have a set release date at this point) which is mainly a feature release and only addresses some of the the data loss issue. For example you can still have everything in an entire XHTML document removed by putting a malformed XML header in the document.

The issue has three components that require major work to fix. I hope to have it all completed for the 0.6 release but it's going to be some time it's ready.

The issues are:

1) Sigil currently uses Tidy to clean all XHTML to ensure it conforms (as much as it can) to the XHTML spec. I have seen Tidy remove tags it thinks are empty when they influence how the document is rendered. I want to keep Tidy as part of Sigil but I believe it should only be run when the user asks for it and any changes it makes the user should be able to revert.

2) An intermediate data store is used that requires valid XML is used. This store shuffles data between the book and code view. Due to this store requiring valid XML (valid XHTML conforms) there is the potential for data loss if it has to auto correct the XHTML. If you are in code view and have malformed structural issues with the XHTML and move out of it there is a warning dialog. This only appears when you are working on one file at a time. If you are replacing across multiple files auto correction is used and this can lead to data loss. This data store needs to be replaced with one that does not require valid XML.

3) Putting malformed content into the book view will cause the book view to try to correct it. Again auto correction can lead to data loss. This is mitigated by the malformed error dialog but many users just disable it and find that sections of their document are missing after looking at it in book view. Also, the book view is a WYSIWYG tool so it does make structural changes to the document and these may or may not be what the user expects. As with Tidy changes made by the book view need to be able to be reverted. I am thinking about ways to make the fact that the book view more obvious that it makes changes to the document. This way the user is aware that they need to use undo (doesn't currently work for book view changes) to revert the changes if they don't like them. I'm thinking about using a preview mode by default that doesn't make any changes and an edit mode to make this distinction obvious.

The above issues can be fixed but they are not quick or easy changes. I plan on making them for the 0.6 release as part of the changes necessary to support EPUB 3. However, there is the possibility that they will slip to 0.7 due to how large they are. Unfortunately, all I can say right now is I'm aware of the issue, I know what the cause is, and I have an idea of how to correct it but it's not going to happen tomorrow.