End of line converter

I recently wrote a Python script to convert the line endings (EOLs) of multiple files from Unix to DOS and vice versa.

I am looking for tips to improve my code, or for a better way of doing something that I may have missed.

#!/usr/bin/env python3

import sys

def main():
    command, *filenames = sys.argv[1:]
    valid_commands = ['-d', '-u']
    sys.tracebacklimit = None

    if not command in valid_commands:
        error = """'{command}'
        Provide the following arguments -u|d file [file2] [file3] ...
        flags:
            -u : converts DOS to UNIX
            -d : converts UNIX to DOS
        example command:
            ./eol -u foo.py bar.py""".format(command=command)

        raise ValueError(error)
        sys.exit(1)

    if filenames:
        convert(filenames, command)
    else:
        print("> no files to convert")

def convert(files, command):
    for file in files:
        text = open(file, 'r').read()

        with open(file, 'w') as current:
            if command == '-u':
                format = 'UNIX'
                current.write(text.replace('\r\n', '\n'))
            elif command == '-d':
                format = 'DOS'
                current.write(text.replace('\n', '\r\n'))

        print("> converting file {filename} to {format} ...".format(
            filename=file, format=format))

if __name__ == "__main__":
    main()

asked Jan 27 at 16:32 by nyvokub, edited Jan 27 at 20:12 by 200_success


2 Answers

accepted

A couple of small observations:

1. `sys.exit(1)` will never be reached, so you can remove it. Apparently you don't want to show the traceback to whoever will be using your script, though that's not what I'd recommend: it's nice to know why and how the program failed. Even if you don't want a traceback, you can always create your own custom exception class:
```
class MyCustomException(Exception):
    pass
```
Which you can raise like this:
```
if bla_bla:
    raise MyCustomException('my message here')
```

2. `format = 'UNIX'` and `format = 'DOS'`: they don't seem to be used anywhere else in the code, so you can remove them.

3. Change `if not command in valid_commands:` to `if command not in valid_commands:`.

4. Use two blank lines between your functions.

5. Use the argparse module to handle command-line arguments.

6. This: `text = open(file, 'r').read()` will load the whole file into memory, which might be bad if you're applying your function to a very large file. I'd recommend you process one line at a time, or at least call `f.read(size)`. From the docs:

To read a file's contents, call f.read(size), which reads some
quantity of data and returns it as a string (in text mode) or bytes
object (in binary mode). size is an optional numeric argument. When
size is omitted or negative, the entire contents of the file will be
read and returned; it's your problem if the file is twice as large as
your machine's memory. Otherwise, at most size bytes are read and
returned.
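To make the argparse and custom-exception suggestions concrete, here is a minimal sketch of how the command line could be handled. The names `ConversionError`, `parse_args`, `check_readable` and the `target` attribute are hypothetical and not from the post, and `check_readable` is only a stand-in for the real conversion step.

```python
import argparse
import sys


class ConversionError(Exception):
    """Hypothetical exception type for failures the user should see."""


def parse_args(argv):
    parser = argparse.ArgumentParser(
        prog="eol", description="Convert line endings of one or more files.")
    # Exactly one of -u / -d must be given; argparse reports misuse itself.
    direction = parser.add_mutually_exclusive_group(required=True)
    direction.add_argument("-u", dest="target", action="store_const",
                           const="UNIX", help="convert DOS to UNIX")
    direction.add_argument("-d", dest="target", action="store_const",
                           const="DOS", help="convert UNIX to DOS")
    parser.add_argument("files", nargs="*", help="files to convert")
    return parser.parse_args(argv)


def check_readable(name):
    """Stand-in for the real conversion step: only checks the file opens."""
    try:
        with open(name, "rb"):
            pass
    except OSError as exc:
        raise ConversionError("cannot open {}: {}".format(name, exc)) from exc


def main(argv):
    args = parse_args(argv)
    if not args.files:
        print("> no files to convert")
        return 0
    try:
        for name in args.files:
            check_readable(name)
            print("> converting file {} to {} ...".format(name, args.target))
    except ConversionError as exc:
        # One readable line on stderr instead of a suppressed traceback.
        print("eol: {}".format(exc), file=sys.stderr)
        return 1
    return 0
```

A script would typically end with `sys.exit(main(sys.argv[1:]))`. With this layout argparse generates the usage text and rejects a missing or conflicting flag with exit status 2, so neither the hand-written help string nor `sys.tracebacklimit` is needed.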

answered Jan 27 at 17:42 by яүυк, edited Jan 27 at 18:00

Great points, although I'm not sure I'm understanding 2. correctly. I use the format variable at the end of the for loop, where it reports what the initial command was. If I get rid of it, then `print("> converting file {filename} to {format} ...".format(filename=file, format=format))` would break.
– nyvokub, Jan 28 at 4:48


The code in the post does not work, because the files are opened in text mode, and in text mode Python 3 translates newlines by default. To quote the Python documentation:

newline controls how line endings are handled. It can be None, '', '\n', '\r', and '\r\n'. It works as follows:

When reading input from the stream, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.

When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.

This means that the code in the post never gets to see the original line endings and so it does not behave as intended when run on Windows. (This makes me suspect that it has not been tested in all four configurations: Unix → DOS on Unix; DOS → Unix on Unix; Unix → DOS on Windows; DOS → Unix on Windows.)
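The translation described above can be observed directly. The following is a small sketch, using a hypothetical helper `read_three_ways` and a scratch file, that reads the same bytes back in default text mode, in text mode with `newline=''`, and in binary mode:

```python
import os
import tempfile


def read_three_ways(raw):
    """Write `raw` to a scratch file and read it back in three modes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")
        with open(path, "wb") as f:
            f.write(raw)
        # Default text mode: universal newlines, endings become '\n'.
        with open(path, "r", encoding="ascii") as f:
            default_text = f.read()
        # newline='': line endings are returned untranslated.
        with open(path, "r", encoding="ascii", newline="") as f:
            untranslated_text = f.read()
        # Binary mode: the raw bytes, no decoding or translation.
        with open(path, "rb") as f:
            binary = f.read()
    return default_text, untranslated_text, binary


default_text, untranslated_text, binary = read_three_ways(b"one\r\ntwo\nthree")
print(repr(default_text))       # 'one\ntwo\nthree'
print(repr(untranslated_text))  # 'one\r\ntwo\nthree'
print(repr(binary))             # b'one\r\ntwo\nthree'
```

In the default mode the '\r\n' has already disappeared by the time the string reaches the caller, so a later `replace('\r\n', '\n')` has nothing left to match.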

In order to operate on the original line endings, you could open the file in binary mode (both for reading and writing), or open it in text mode but set newline='' so that newlines are not translated.
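As one possible shape for the binary-mode approach, here is a hedged sketch rather than a drop-in replacement. The function name `convert_file`, the `ENDINGS` table and the write-to-temporary-file-then-rename step are my assumptions, not part of the post. It processes one line at a time, so memory use is bounded by the longest line instead of the whole file.

```python
import os
import shutil
import tempfile

# Hypothetical mapping from target format name to its line terminator.
ENDINGS = {"UNIX": b"\n", "DOS": b"\r\n"}


def convert_file(path, target):
    """Rewrite `path` so every '\\n' or '\\r\\n' ending becomes the target's."""
    ending = ENDINGS[target]
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file next to the original, then swap it in.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            for line in src:  # binary iteration splits after each b"\n"
                if line.endswith(b"\r\n"):
                    dst.write(line[:-2] + ending)
                elif line.endswith(b"\n"):
                    dst.write(line[:-1] + ending)
                else:
                    dst.write(line)  # final line without a terminator
        shutil.copymode(path, tmp_path)  # keep the original permission bits
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise
```

Because each line's existing terminator is stripped before the new one is appended, running the DOS conversion on a file that already has '\r\n' endings leaves it unchanged instead of producing '\r\r\n'. Lone '\r' characters (old Mac style) are left untouched in this sketch.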

answered Jan 28 at 10:36 by Gareth Rees

Great point! I read up on binary mode but was not sure what the intended use for it was. Thank you for clarifying.
– nyvokub, Jan 28 at 15:01
