Skip to content

feat(markdown-transform): add transformation CiceroMark<->OOXML#424

Merged
algomaster99 merged 10 commits intoalgoo-ooxmlfrom
k-kumar-01/markdown-cli-integration
Jul 30, 2021
Merged

feat(markdown-transform): add transformation CiceroMark<->OOXML#424
algomaster99 merged 10 commits intoalgoo-ooxmlfrom
k-kumar-01/markdown-cli-integration

Conversation

@K-Kumar-01
Copy link
Copy Markdown
Collaborator

@K-Kumar-01 K-Kumar-01 commented Jul 4, 2021

Signed-off-by: K-Kumar-01 kushalkumargupta4@gmail.com

Adds the transformation CiceroMark<->OOXML in the transformation graph.

Changes

  • Add the transformer to convert OOXML->CiceroMark
  • Add the transformer to convert CiceroMark->OOXML.

Author Checklist

  • Ensure you provide a DCO sign-off for your commits using the --signoff option of git commit.
  • Vital features and changes captured in unit and/or integration tests
  • Commits messages follow AP format
  • Extend the documentation, if necessary
  • Merging to master from fork:branchname

@K-Kumar-01
Copy link
Copy Markdown
Collaborator Author

@dselman @algomaster99
I have created a PR to integrate the CiceroMark<->OOXML in markdown-cli.
I was not sure how to test it.

Also, the build fails due to a test. Merging PR will fix this.

Copy link
Copy Markdown
Contributor

@algomaster99 algomaster99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped the comments. Did you even try running the CLI with your changes? It was broken both ways (CiceroMark <-> OOXML) as I have elaborated in the two comments below.

},
ooxml: {
docs: 'OOXML',
fileFormat: 'binary',
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not binary. It should be utf-8 only. If you read the output after executing ./packages/markdown-cli/index.js transform --from ciceromark_parsed --to ooxml --input input.json, you will just get <binary data> in the console. Moreover, you can even read and understand what's written in the file so binary makes no sense here.

docs: 'OOXML',
fileFormat: 'binary',
ciceromark_parsed: async(input) => {
return OoxmlTransformer.toCiceroMark(input);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would throw an error because toCiceroMark is not a static method.

@algomaster99
Copy link
Copy Markdown
Contributor

I was not sure how to test it.

Try taking inspiration from markdown-html.

@K-Kumar-01
Copy link
Copy Markdown
Collaborator Author

K-Kumar-01 commented Jul 5, 2021

@algomaster99
Had a doubt regarding CiceroMarkJSON->OOXML.
The test will fail because the OOXML styles differ. Shall I first generate an OOXML using the transformer first and then check?

Another approach for this can be to convert the OOXML to JSON using xmljs and check using deep.equal().

Which approach should I go for?

const PdfTransformer = require('@accordproject/markdown-pdf').PdfTransformer;
const DocxTransformer = require('@accordproject/markdown-docx').DocxTransformer;
const OoxmlTransformer = require('@accordproject/markdown-docx').OoxmlTransformer;
const CiceroMarkToOOXMLTransfomer = require('@accordproject/markdown-docx').CiceroMarkToOOXMLTransfomer;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow the same naming convention as the other transformers. The name of the class is the transformation, while the methods indicate what is being transformed.

@algomaster99
Copy link
Copy Markdown
Contributor

The test will fail because the OOXML styles differ

What do you mean? You can configure the OOXML for acceptance of delivery in markdown-transform/packages/markdown-transform/test/transform.js. After that, you can test if it's converted correctly to CiceroMark parsed.

@K-Kumar-01
Copy link
Copy Markdown
Collaborator Author

K-Kumar-01 commented Jul 5, 2021

The test will fail because the OOXML styles differ

What do you mean? You can configure the OOXML for acceptance of delivery in markdown-transform/packages/markdown-transform/test/transform.js. After that, you can test if it's converted correctly to CiceroMark parsed.

I mean for the other way round, where we convert from CiceroMark->OOXML.

Also, regarding Dan's changes, we would need to change the naming. Currently,
Oomxltransformer: OOXML -> CiceroMark
CiceroMarkToOOXMLTransformer: CiceroMark->OOXML

I am thinking of keeping the former same and change. Earlier I thought of CiceroMarkTransforer but it already declared. So we would need to think of other.
NOTE
According to the docstring
HTMLTransformer: Converts a CiceroMark or CommonMark DOM to HTML.
PdfTransformer: Converts a PDF to CiceroMark DOM

Though, both the functions carry A<->B transformation.

@dselman
Copy link
Copy Markdown
Contributor

dselman commented Jul 5, 2021

The test will fail because the OOXML styles differ

What do you mean? You can configure the OOXML for acceptance of delivery in markdown-transform/packages/markdown-transform/test/transform.js. After that, you can test if it's converted correctly to CiceroMark parsed.

I mean for the other way round, where we convert from CiceroMark->OOXML.

Also, regarding Dan's changes, we would need to change the naming. Currently,
Oomxltransformer: OOXML -> CiceroMark
CiceroMarkToOOXMLTransformer: CiceroMark->OOXML

I am thinking of keeping the former same and change the latter to CiceroMarkTransformer. Will it work fine?

NOTE
According to the docstring
HTMLTransformer: Converts a CiceroMark or CommonMark DOM to HTML.
PdfTransformer: Converts a PDF to CiceroMark DOM

Though, both the functions carry A<->B transformation.

Yes, we want a single class (OOXMLTransformer?) that can perform OOXML <-> CiceroMark via 2 separate methods. This is the pattern we have used for all the other transformers.

E.g. https://github.com/accordproject/markdown-transform/blob/master/packages/markdown-html/src/HtmlTransformer.js#L40

And:

https://github.com/accordproject/markdown-transform/blob/master/packages/markdown-html/src/HtmlTransformer.js#L60

@K-Kumar-01
Copy link
Copy Markdown
Collaborator Author

The test will fail because the OOXML styles differ

What do you mean? You can configure the OOXML for acceptance of delivery in markdown-transform/packages/markdown-transform/test/transform.js. After that, you can test if it's converted correctly to CiceroMark parsed.

I mean for the other way round, where we convert from CiceroMark->OOXML.
Also, regarding Dan's changes, we would need to change the naming. Currently,
Oomxltransformer: OOXML -> CiceroMark
CiceroMarkToOOXMLTransformer: CiceroMark->OOXML
I am thinking of keeping the former same and change the latter to CiceroMarkTransformer. Will it work fine?
NOTE
According to the docstring
HTMLTransformer: Converts a CiceroMark or CommonMark DOM to HTML.
PdfTransformer: Converts a PDF to CiceroMark DOM
Though, both the functions carry A<->B transformation.

Yes, we want a single class (OOXMLTransformer?) that can perform OOXML <-> CiceroMark via 2 separate methods. This is the pattern we have used for all the other transformers.

E.g. https://github.com/accordproject/markdown-transform/blob/master/packages/markdown-html/src/HtmlTransformer.js#L40

And:

https://github.com/accordproject/markdown-transform/blob/master/packages/markdown-html/src/HtmlTransformer.js#L60

@dselman
It would cause a single file to consist of ~450 lines(as it stands and more later). If there is no issue in that, then I will do the
refactoring in a separate PR.

Your thoughts, @algomaster99

@algomaster99
Copy link
Copy Markdown
Contributor

Your thoughts, @algomaster99

Always in for making the codebase more consistent.

@dselman
Copy link
Copy Markdown
Contributor

dselman commented Jul 5, 2021

Your thoughts, @algomaster99

Always in for making the codebase more consistent.

If you check some of the other transforms you will see that the single public (exported) class delegates to visitor classes etc. for the different transforms to keep the code modular and easier to maintain.

@K-Kumar-01
Copy link
Copy Markdown
Collaborator Author

@dselman @algomaster99
I am facing an error while I was writing the test for markdown-cli.
Screenshot from 2021-07-09 20-59-29
I have imported the files above similarly in the markdown-transform also and the tests there pass.

@dselman
Copy link
Copy Markdown
Contributor

dselman commented Jul 9, 2021

@dselman @algomaster99
I am facing an error while I was writing the test for markdown-cli.
Screenshot from 2021-07-09 20-59-29
I have imported the files above similarly in the markdown-transform also and the tests there pass.

My guess is that your input OOXML is being used as a file path. Note the ENAMETOOLONG error.

Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>
Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>
Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>
@K-Kumar-01 K-Kumar-01 force-pushed the k-kumar-01/markdown-cli-integration branch from 852dadc to a15508f Compare July 9, 2021 21:22
@K-Kumar-01
Copy link
Copy Markdown
Collaborator Author

@dselman @algomaster99
Requesting your review

Copy link
Copy Markdown
Contributor

@dselman dselman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting close! Couple of questions...

},
ooxml: (input) => {
const t = new OOXMLTransformer();
return t.toOOXML(input).ooxml;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the ooxml child attribute here? The other transformations just return the output.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dselman
Yes, actually OOXMLTransfomer returns {ooxml, variableFrequency}.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is variable frequency and why does the caller need it?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It returns the count of different variables present in a template.
We need it because there is a requirement to provide a unique identifier to rich text input in MS-WORD which is generated using this.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is variable frequency and why does the caller need it?

MS Word would require it for rendering all the variables.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generating a unique idea is not the problem but getting back to attach event listeners is. Moreover, we want to keep this ID as close as possible to the actual variable name as they are means to inform the client which variable they are interacting with.

cc

Shipper1 is the title and if someone edits, it would mean that they want to modify the shipper variable.

Copy link
Copy Markdown
Contributor

@algomaster99 algomaster99 Jul 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am looking for ways to make us not require to export the counter variable. But we shall use it for OOXML generation as we want to make the title of the content control unique.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@K-Kumar-01 I think you can remove exporting of counter. I can demonstrate how we can do without it in the next meeting.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@algomaster99
That would need refactoring first then. I will create a separate PR that will deal with the above.
Once that is done, I will merge it.
Since, currently only windows are available on my pc it might take some time to close these PRs.

PS: HDD corruption made Linux vanish 😢. Will reinstall it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. No problem.

return t.toCiceroMark(input, options);
},
},
ooxml: {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you also need to register the ciceromark_parsed -> ooxml transform?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The transformer is converting both ooxml<->Ciceromark

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is it looks like you only registered the ciceromark_parsed -> ooxml transformation. You will need to add a similar block to do the reverse, under a new ooxml block.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check that you can transform from a markdown file to ooxml and the reverse by running the CLI.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dselman
AFAIR, I was able to convert both transformations ooxml<->ciceromark_parsed when I pushed the commit. However, I will recheck and let you know ASAP.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dselman
Apologies for the delay. I have checked and it converts CiceroMark_parsed <-> OOXML. So, this feature is as we expected.

Comment on lines +152 to +163
describe('markdown-cli (ooxml)', () => {
// Partial Acceptance test
const inputOOXMLFile = path.resolve(__dirname, 'data/acceptance', 'partial_acceptance.xml');
const expectedCiceroMarkParsed = JSON.parse(fs.readFileSync(path.resolve(__dirname, 'data/acceptance', 'partial_acceptance.json'), 'utf8'));

describe('#parse', () => {
it('should generate a ciceromark_parsed from ooxml', async () => {
const result = await Commands.transform(inputOOXMLFile, 'ooxml', [], 'ciceromark_parsed', null, {}, {});
JSON.parse(result).should.deep.equal(expectedCiceroMarkParsed);
});
});
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can probably check for roundtrip by passing the roundtrip: true option. Check example.

},
ooxml: (input) => {
const t = new OOXMLTransformer();
return t.toOOXML(input).ooxml;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is variable frequency and why does the caller need it?

MS Word would require it for rendering all the variables.

});

describe('#ooxml', () => {
it('ooxml->ciceromark_parsed', async () => {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spaces before and after ->.

@algomaster99
Copy link
Copy Markdown
Contributor

@K-Kumar-01 Did you get time to look at the changes I suggested?

@K-Kumar-01
Copy link
Copy Markdown
Collaborator Author

@K-Kumar-01 Did you get time to look at the changes I suggested?

@algomaster99
I am sorry it slipped my mind. I will push the changes surely by today.

Roundtrip check for test in cli
Remove the json file
Test statement changes

Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>
Copy link
Copy Markdown
Contributor

@algomaster99 algomaster99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. The tests are failing though, and the reason is not #422 .

@K-Kumar-01
Copy link
Copy Markdown
Collaborator Author

Looks good to me. The tests are failing though, and the reason is not #422 .

I have the reason. I deleted the json for the wrong package, mistakenly. 😅
I didn't re-run the test as it was passing.
I will update the PR ASAP and then we can merge it 😀

Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>
Copy link
Copy Markdown
Contributor

@algomaster99 algomaster99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have another general suggestion about the naming of the test file and I know it is coming quite in this review process. Partial acceptance sounds like that your test is paritally correct. I think omitted-acceptance-of-delivery would be a better name.

@dselman I think @K-Kumar-01 is done with this. Could you look at it again?

Comment thread packages/markdown-cli/test/cli.js Outdated
Comment on lines +157 to +160
it('should roundtrip ciceromark_parsed <-> ooxml', async () => {
const result = await Commands.transform(inputOOXMLFile, 'ooxml', [], 'ciceromark_parsed', null, {}, { roundtrip: true });
result.should.equal(fs.readFileSync(inputOOXMLFile, 'utf-8'));
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you would also need to check for going from cicero mark to OOXML. And for that, you would need to create a JSON file partial_acceptance.json.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@algomaster99 Will update the filenames and tests ASAP.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you would also need to check for going from cicero mark to OOXML. And for that, you would need to create a JSON file partial_acceptance.json.

@algomaster99
Regarding this, we already check for the conversion test so why do we do this one. roundtrip: true also pass so I am not able to understand the need to add this.
In transform.js, we have already done the test for both.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding this, we already check for the conversion test so why do we do this one. roundtrip: true also pass so I am not able to understand the need to add this.

We aren't checking for equality for ciceromark_parsed in between so that's why it's needed.

In transform.js, we have already done the test for both.

Yeah, I am aware. I even pointed this out in the meeting. With that logic, I think we can completely remove any testing related to this PR from markdown-cli.

Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>
@K-Kumar-01
Copy link
Copy Markdown
Collaborator Author

@dselman @algomaster99
Screenshot from 2021-07-23 17-24-52
Screenshot from 2021-07-23 17-25-36
Screenshot from 2021-07-23 17-25-40
Screenshot from 2021-07-23 17-25-41

I am not understanding why the test fails.
Both the string seems to be identical to me.
Can any one you help me here. After this is resolved, this PR will get ready to be merged.

@algomaster99
Copy link
Copy Markdown
Contributor

@K-Kumar-01 It's difficult to find out this way. You can use diff editors or just run diff over them to understand the difference.

K-Kumar-01 and others added 3 commits July 26, 2021 17:28
Signed-off-by: K-Kumar-01 <kushalkumargupta4@gmail.com>
Signed-off-by: k-kumar-01 <kushalkumargupta4@gmail.com>
add json file for test
update the transform function

Signed-off-by: k-kumar-01 <kushalkumargupta4@gmail.com>
@K-Kumar-01 K-Kumar-01 requested a review from algomaster99 July 26, 2021 12:02
Copy link
Copy Markdown
Contributor

@algomaster99 algomaster99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor change and then we can merge.

Also, you haven't merged the latest changes correctly and that's why changes related to thematic break, etc are appearing in this PR.

Comment on lines +58 to +59
const partialAcceptanceXML = fs.readFileSync(path.resolve(__dirname, 'data/acceptance', 'omitted-acceptance-of-delivery.xml'), 'utf8');
const partialAcceptanceCiceroMarkParsed = JSON.parse(fs.readFileSync(path.resolve(__dirname, 'data/acceptance', 'omitted-acceptance-of-delivery.json'), 'utf8'));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have changed to omitted above, do those changes here as well.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signed-off-by: k-kumar-01 <kushalkumargupta4@gmail.com>
@K-Kumar-01 K-Kumar-01 requested a review from algomaster99 July 26, 2021 16:22
Copy link
Copy Markdown
Contributor

@algomaster99 algomaster99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dselman I think this is done. Could you have a final look and if you think alike, go ahead with squash and merge?

@algomaster99 algomaster99 merged commit 2b617d1 into algoo-ooxml Jul 30, 2021
@algomaster99 algomaster99 deleted the k-kumar-01/markdown-cli-integration branch July 30, 2021 10:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants