feat(markdown-transform): add transformation CiceroMark<->OOXML by K-Kumar-01 · Pull Request #424 · accordproject/markdown-transform

K-Kumar-01 · 2021-07-04T11:04:41Z

Signed-off-by: K-Kumar-01 kushalkumargupta4@gmail.com

Adds the transformation CiceroMark<->OOXML in the transformation graph.

Changes

Add the transformer to convert OOXML->CiceroMark
Add the transformer to convert CiceroMark->OOXML.

Author Checklist

Ensure you provide a DCO sign-off for your commits using the --signoff option of git commit.
Vital features and changes captured in unit and/or integration tests
Commits messages follow AP format
Extend the documentation, if necessary
Merging to master from fork:branchname

K-Kumar-01 · 2021-07-04T11:08:38Z

@dselman @algomaster99
I have created a PR to integrate the CiceroMark<->OOXML in markdown-cli.
I was not sure how to test it.

Also, the build fails due to a test. Merging PR will fix this.

algomaster99

Dropped the comments. Did you even try running the CLI with your changes? It was broken both ways (CiceroMark <-> OOXML) as I have elaborated in the two comments below.

algomaster99 · 2021-07-04T17:41:21Z

    },
+    ooxml: {
+        docs: 'OOXML',
+        fileFormat: 'binary',


It is not binary. It should be utf-8 only. If you read the output after executing ./packages/markdown-cli/index.js transform --from ciceromark_parsed --to ooxml --input input.json, you will just get <binary data> in the console. Moreover, you can even read and understand what's written in the file so binary makes no sense here.

algomaster99 · 2021-07-04T18:08:42Z

+        docs: 'OOXML',
+        fileFormat: 'binary',
+        ciceromark_parsed: async(input) => {
+            return OoxmlTransformer.toCiceroMark(input);


This would throw an error because toCiceroMark is not a static method.

algomaster99 · 2021-07-04T18:17:43Z

I was not sure how to test it.

Try taking inspiration from markdown-html.

K-Kumar-01 · 2021-07-05T12:38:58Z

@algomaster99
Had a doubt regarding CiceroMarkJSON->OOXML.
The test will fail because the OOXML styles differ. Shall I first generate an OOXML using the transformer first and then check?

Another approach for this can be to convert the OOXML to JSON using xmljs and check using deep.equal().

Which approach should I go for?

dselman · 2021-07-05T13:05:58Z

 const PdfTransformer = require('@accordproject/markdown-pdf').PdfTransformer;
 const DocxTransformer = require('@accordproject/markdown-docx').DocxTransformer;
+const OoxmlTransformer = require('@accordproject/markdown-docx').OoxmlTransformer;
+const CiceroMarkToOOXMLTransfomer = require('@accordproject/markdown-docx').CiceroMarkToOOXMLTransfomer;


Please follow the same naming convention as the other transformers. The name of the class is the transformation, while the methods indicate what is being transformed.

algomaster99 · 2021-07-05T14:56:04Z

The test will fail because the OOXML styles differ

What do you mean? You can configure the OOXML for acceptance of delivery in markdown-transform/packages/markdown-transform/test/transform.js. After that, you can test if it's converted correctly to CiceroMark parsed.

K-Kumar-01 · 2021-07-05T15:15:56Z

The test will fail because the OOXML styles differ

What do you mean? You can configure the OOXML for acceptance of delivery in markdown-transform/packages/markdown-transform/test/transform.js. After that, you can test if it's converted correctly to CiceroMark parsed.

I mean for the other way round, where we convert from CiceroMark->OOXML.

Also, regarding Dan's changes, we would need to change the naming. Currently,
Oomxltransformer: OOXML -> CiceroMark
CiceroMarkToOOXMLTransformer: CiceroMark->OOXML

I am thinking of keeping the former same and change. Earlier I thought of CiceroMarkTransforer but it already declared. So we would need to think of other.
NOTE
According to the docstring
HTMLTransformer: Converts a CiceroMark or CommonMark DOM to HTML.
PdfTransformer: Converts a PDF to CiceroMark DOM

Though, both the functions carry A<->B transformation.

dselman · 2021-07-05T15:26:59Z

The test will fail because the OOXML styles differ

What do you mean? You can configure the OOXML for acceptance of delivery in markdown-transform/packages/markdown-transform/test/transform.js. After that, you can test if it's converted correctly to CiceroMark parsed.

I mean for the other way round, where we convert from CiceroMark->OOXML.

Also, regarding Dan's changes, we would need to change the naming. Currently,
Oomxltransformer: OOXML -> CiceroMark
CiceroMarkToOOXMLTransformer: CiceroMark->OOXML

I am thinking of keeping the former same and change the latter to CiceroMarkTransformer. Will it work fine?

NOTE
According to the docstring
HTMLTransformer: Converts a CiceroMark or CommonMark DOM to HTML.
PdfTransformer: Converts a PDF to CiceroMark DOM

Though, both the functions carry A<->B transformation.

Yes, we want a single class (OOXMLTransformer?) that can perform OOXML <-> CiceroMark via 2 separate methods. This is the pattern we have used for all the other transformers.

E.g. https://github.com/accordproject/markdown-transform/blob/master/packages/markdown-html/src/HtmlTransformer.js#L40

And:

https://github.com/accordproject/markdown-transform/blob/master/packages/markdown-html/src/HtmlTransformer.js#L60

K-Kumar-01 · 2021-07-05T15:30:39Z

The test will fail because the OOXML styles differ

What do you mean? You can configure the OOXML for acceptance of delivery in markdown-transform/packages/markdown-transform/test/transform.js. After that, you can test if it's converted correctly to CiceroMark parsed.

I mean for the other way round, where we convert from CiceroMark->OOXML.
Also, regarding Dan's changes, we would need to change the naming. Currently,
Oomxltransformer: OOXML -> CiceroMark
CiceroMarkToOOXMLTransformer: CiceroMark->OOXML
I am thinking of keeping the former same and change the latter to CiceroMarkTransformer. Will it work fine?
NOTE
According to the docstring
HTMLTransformer: Converts a CiceroMark or CommonMark DOM to HTML.
PdfTransformer: Converts a PDF to CiceroMark DOM
Though, both the functions carry A<->B transformation.

Yes, we want a single class (OOXMLTransformer?) that can perform OOXML <-> CiceroMark via 2 separate methods. This is the pattern we have used for all the other transformers.

E.g. https://github.com/accordproject/markdown-transform/blob/master/packages/markdown-html/src/HtmlTransformer.js#L40

And:

https://github.com/accordproject/markdown-transform/blob/master/packages/markdown-html/src/HtmlTransformer.js#L60

@dselman
It would cause a single file to consist of ~450 lines(as it stands and more later). If there is no issue in that, then I will do the
refactoring in a separate PR.

Your thoughts, @algomaster99

algomaster99 · 2021-07-05T15:46:48Z

Your thoughts, @algomaster99

Always in for making the codebase more consistent.

dselman · 2021-07-05T15:55:57Z

Your thoughts, @algomaster99

Always in for making the codebase more consistent.

If you check some of the other transforms you will see that the single public (exported) class delegates to visitor classes etc. for the different transforms to keep the code modular and easier to maintain.

K-Kumar-01 · 2021-07-09T15:35:28Z

@dselman @algomaster99
I am facing an error while I was writing the test for markdown-cli.

I have imported the files above similarly in the markdown-transform also and the tests there pass.

dselman · 2021-07-09T16:20:24Z

@dselman @algomaster99
I am facing an error while I was writing the test for markdown-cli.

I have imported the files above similarly in the markdown-transform also and the tests there pass.

My guess is that your input OOXML is being used as a file path. Note the ENAMETOOLONG error.

Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>

K-Kumar-01 · 2021-07-13T05:12:56Z

@dselman @algomaster99
Requesting your review

dselman

Getting close! Couple of questions...

dselman · 2021-07-13T16:57:38Z

        },
+        ooxml: (input) => {
+            const t = new OOXMLTransformer();
+            return t.toOOXML(input).ooxml;


Do we need the ooxml child attribute here? The other transformations just return the output.

@dselman
Yes, actually OOXMLTransfomer returns {ooxml, variableFrequency}.

What is variable frequency and why does the caller need it?

It returns the count of different variables present in a template.
We need it because there is a requirement to provide a unique identifier to rich text input in MS-WORD which is generated using this.

What is variable frequency and why does the caller need it?

MS Word would require it for rendering all the variables.

Generating a unique idea is not the problem but getting back to attach event listeners is. Moreover, we want to keep this ID as close as possible to the actual variable name as they are means to inform the client which variable they are interacting with.

Shipper1 is the title and if someone edits, it would mean that they want to modify the shipper variable.

I am looking for ways to make us not require to export the counter variable. But we shall use it for OOXML generation as we want to make the title of the content control unique.

@K-Kumar-01 I think you can remove exporting of counter. I can demonstrate how we can do without it in the next meeting.

@algomaster99
That would need refactoring first then. I will create a separate PR that will deal with the above.
Once that is done, I will merge it.
Since, currently only windows are available on my pc it might take some time to close these PRs.

PS: HDD corruption made Linux vanish 😢. Will reinstall it.

Sure. No problem.

dselman · 2021-07-13T16:58:27Z

            return t.toCiceroMark(input, options);
        },
    },
+    ooxml: {


Don't you also need to register the ciceromark_parsed -> ooxml transform?

The transformer is converting both ooxml<->Ciceromark

My point is it looks like you only registered the ciceromark_parsed -> ooxml transformation. You will need to add a similar block to do the reverse, under a new ooxml block.

Please check that you can transform from a markdown file to ooxml and the reverse by running the CLI.

@dselman
AFAIR, I was able to convert both transformations ooxml<->ciceromark_parsed when I pushed the commit. However, I will recheck and let you know ASAP.

@dselman
Apologies for the delay. I have checked and it converts CiceroMark_parsed <-> OOXML. So, this feature is as we expected.

algomaster99 · 2021-07-16T16:36:44Z

+describe('markdown-cli (ooxml)', () => {
+    // Partial Acceptance test
+    const inputOOXMLFile = path.resolve(__dirname, 'data/acceptance', 'partial_acceptance.xml');
+    const expectedCiceroMarkParsed = JSON.parse(fs.readFileSync(path.resolve(__dirname, 'data/acceptance', 'partial_acceptance.json'), 'utf8'));
+
+    describe('#parse', () => {
+        it('should generate a ciceromark_parsed from ooxml', async () => {
+            const result = await Commands.transform(inputOOXMLFile, 'ooxml', [], 'ciceromark_parsed', null, {}, {});
+            JSON.parse(result).should.deep.equal(expectedCiceroMarkParsed);
+        });
+    });
+});


You can probably check for roundtrip by passing the roundtrip: true option. Check example.

algomaster99 · 2021-07-16T16:37:49Z

        },
+        ooxml: (input) => {
+            const t = new OOXMLTransformer();
+            return t.toOOXML(input).ooxml;


What is variable frequency and why does the caller need it?

MS Word would require it for rendering all the variables.

algomaster99 · 2021-07-16T16:40:26Z

 });
+
+describe('#ooxml', () => {
+    it('ooxml->ciceromark_parsed', async () => {


Spaces before and after ->.

algomaster99 · 2021-07-18T11:26:42Z

@K-Kumar-01 Did you get time to look at the changes I suggested?

K-Kumar-01 · 2021-07-18T11:33:14Z

@K-Kumar-01 Did you get time to look at the changes I suggested?

@algomaster99
I am sorry it slipped my mind. I will push the changes surely by today.

Roundtrip check for test in cli Remove the json file Test statement changes Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>

algomaster99

Looks good to me. The tests are failing though, and the reason is not #422 .

K-Kumar-01 · 2021-07-18T14:17:07Z

Looks good to me. The tests are failing though, and the reason is not #422 .

I have the reason. I deleted the json for the wrong package, mistakenly. 😅
I didn't re-run the test as it was passing.
I will update the PR ASAP and then we can merge it 😀

Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>

algomaster99

I have another general suggestion about the naming of the test file and I know it is coming quite in this review process. Partial acceptance sounds like that your test is paritally correct. I think omitted-acceptance-of-delivery would be a better name.

@dselman I think @K-Kumar-01 is done with this. Could you look at it again?

algomaster99 · 2021-07-19T11:55:20Z

+        it('should roundtrip ciceromark_parsed <-> ooxml', async () => {
+            const result = await Commands.transform(inputOOXMLFile, 'ooxml', [], 'ciceromark_parsed', null, {}, { roundtrip: true });
+            result.should.equal(fs.readFileSync(inputOOXMLFile, 'utf-8'));
+        });


I think you would also need to check for going from cicero mark to OOXML. And for that, you would need to create a JSON file partial_acceptance.json.

@algomaster99 Will update the filenames and tests ASAP.

I think you would also need to check for going from cicero mark to OOXML. And for that, you would need to create a JSON file partial_acceptance.json.

@algomaster99
Regarding this, we already check for the conversion test so why do we do this one. roundtrip: true also pass so I am not able to understand the need to add this.
In transform.js, we have already done the test for both.

Regarding this, we already check for the conversion test so why do we do this one. roundtrip: true also pass so I am not able to understand the need to add this.

We aren't checking for equality for ciceromark_parsed in between so that's why it's needed.

In transform.js, we have already done the test for both.

Yeah, I am aware. I even pointed this out in the meeting. With that logic, I think we can completely remove any testing related to this PR from markdown-cli.

Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>

K-Kumar-01 · 2021-07-23T11:58:11Z

@dselman @algomaster99

I am not understanding why the test fails.
Both the string seems to be identical to me.
Can any one you help me here. After this is resolved, this PR will get ready to be merged.

algomaster99 · 2021-07-23T16:08:03Z

@K-Kumar-01 It's difficult to find out this way. You can use diff editors or just run diff over them to understand the difference.

Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>

Signed-off-by: k-kumar-01 <kushalkumargupta4@gmail.com>

add json file for test update the transform function Signed-off-by: k-kumar-01 <kushalkumargupta4@gmail.com>

algomaster99

One minor change and then we can merge.

Also, you haven't merged the latest changes correctly and that's why changes related to thematic break, etc are appearing in this PR.

algomaster99 · 2021-07-26T14:59:14Z

+const partialAcceptanceXML = fs.readFileSync(path.resolve(__dirname, 'data/acceptance', 'omitted-acceptance-of-delivery.xml'), 'utf8');
+const partialAcceptanceCiceroMarkParsed = JSON.parse(fs.readFileSync(path.resolve(__dirname, 'data/acceptance', 'omitted-acceptance-of-delivery.json'), 'utf8'));


If you have changed to omitted above, do those changes here as well.

@algomaster99
Done

Signed-off-by: k-kumar-01 <kushalkumargupta4@gmail.com>

algomaster99

@dselman I think this is done. Could you have a final look and if you think alike, go ahead with squash and merge?

K-Kumar-01 requested review from algomaster99 and dselman July 4, 2021 11:04

algomaster99 suggested changes Jul 4, 2021

View reviewed changes

dselman requested changes Jul 5, 2021

View reviewed changes

K-Kumar-01 added 3 commits July 10, 2021 02:51

feat(markdown-transform): add transformation CiceroMark<->OOXML

a1b7f42

Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>

fix(markdown-transform): conversion OOXML<->CiceroMark

aa48bff

Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>

tests: add test for ciceroMark and ooxml

a15508f

Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>

K-Kumar-01 force-pushed the k-kumar-01/markdown-cli-integration branch from 852dadc to a15508f Compare July 9, 2021 21:22

K-Kumar-01 requested review from algomaster99 and dselman July 9, 2021 21:23

dselman requested changes Jul 13, 2021

View reviewed changes

algomaster99 suggested changes Jul 16, 2021

View reviewed changes

tests: roundtrip check, remove json file

fed7036

Roundtrip check for test in cli Remove the json file Test statement changes Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>

K-Kumar-01 requested review from algomaster99 and dselman July 18, 2021 12:51

algomaster99 approved these changes Jul 18, 2021

View reviewed changes

test: change partial_acceptance.json dir

c9b3fc3

Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>

algomaster99 suggested changes Jul 19, 2021

View reviewed changes

rename files and file import

f98a43e

Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>

K-Kumar-01 mentioned this pull request Jul 23, 2021

refactor(markdown-docx): remove return variable frequency api #429

Merged

5 tasks

K-Kumar-01 and others added 3 commits July 26, 2021 17:28

feat(markdown-docx): add thematic-break transformer - #397 (#428)

8dec9d1

Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>

refactor(markdown-docx): remove return variable frequency api (#429)

bafffd4

Signed-off-by: k-kumar-01 <kushalkumargupta4@gmail.com>

test(markdown-cli): add roundtrip test for both side conversions

0b28424

add json file for test update the transform function Signed-off-by: k-kumar-01 <kushalkumargupta4@gmail.com>

K-Kumar-01 requested a review from algomaster99 July 26, 2021 12:02

algomaster99 suggested changes Jul 26, 2021

View reviewed changes

refactor: rename file imports

25e3085

Signed-off-by: k-kumar-01 <kushalkumargupta4@gmail.com>

K-Kumar-01 requested a review from algomaster99 July 26, 2021 16:22

algomaster99 approved these changes Jul 26, 2021

View reviewed changes

algomaster99 merged commit 2b617d1 into algoo-ooxml Jul 30, 2021

algomaster99 deleted the k-kumar-01/markdown-cli-integration branch July 30, 2021 10:01

		const partialAcceptanceXML = fs.readFileSync(path.resolve(__dirname, 'data/acceptance', 'omitted-acceptance-of-delivery.xml'), 'utf8');
		const partialAcceptanceCiceroMarkParsed = JSON.parse(fs.readFileSync(path.resolve(__dirname, 'data/acceptance', 'omitted-acceptance-of-delivery.json'), 'utf8'));

Conversation

K-Kumar-01 commented Jul 4, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Author Checklist

Uh oh!

K-Kumar-01 commented Jul 4, 2021

Uh oh!

algomaster99 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

algomaster99 commented Jul 4, 2021

Uh oh!

K-Kumar-01 commented Jul 5, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

algomaster99 commented Jul 5, 2021

Uh oh!

K-Kumar-01 commented Jul 5, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dselman commented Jul 5, 2021

Uh oh!

K-Kumar-01 commented Jul 5, 2021

Uh oh!

algomaster99 commented Jul 5, 2021

Uh oh!

dselman commented Jul 5, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

K-Kumar-01 commented Jul 9, 2021

Uh oh!

dselman commented Jul 9, 2021

Uh oh!

K-Kumar-01 commented Jul 13, 2021

Uh oh!

dselman left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

algomaster99 Jul 20, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

K-Kumar-01 commented Jul 4, 2021 •

edited

Loading

K-Kumar-01 commented Jul 5, 2021 •

edited

Loading

K-Kumar-01 commented Jul 5, 2021 •

edited

Loading

dselman commented Jul 5, 2021 •

edited

Loading

algomaster99 Jul 20, 2021 •

edited

Loading