Python program to transform an Excel document, with some parsing

Multi tool use
Multi tool use

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
5
down vote

favorite












I wrote this to parse through an Excel doc, extract desired columns, parse certain cells and output another formatted Excel doc.



This is the original Excel doc I have condensed most of the columns for space



excelsnip



Here are the the results of the program that I am mostly happy with, it parses the name column and writes a unit identifier(i.e 83) along with a tower identifier(i.e 5) on a new line along with a completed date if the completed column is TRUE. I think the only thing I would change would be getting rid of the blank line under the row that has H 7



resultsnip



And here is the code



import xlrd
import xlwt
import re

# open and create workbook
workbookread = xlrd.open_workbook('seedtestexcelbytask.xls')
workbookwrite = xlwt.Workbook('output.xlsx')

# open and create sheets
worksheetread = workbookread.sheet_by_index(0)
worksheetwrite = workbookwrite.add_sheet('Sheet1')

timelist =
towerlist =

i = 0
l = 0
t = 0

# iterate through rows in worksheet, get columns completed, completed_at and name.
# add to timelist and towerlist if completed == true
for rows in range(worksheetread.nrows):
cell0 = worksheetread.cell(rows, 2).value
cell1 = worksheetread.cell(rows, 3).value
cell2 = worksheetread.cell(rows, 15).value

if cell0 == 1:
parsedvalue = re.split(', |/', cell2)
timelist.append(cell1)
towerlist.append(parsedvalue)

b = 2
g = 1
for rows in range(len(timelist)):
row = worksheetwrite.row(i)

# get current index from timelist and towerlist and assign to currenttime and currentvalue

currenttime = timelist[t]
currentname = towerlist[t]

# get the length of the current index this will help us decide how to parse
length = len(towerlist[t])

# if length <= 2 the index is not formatted properly so we only need index 1 which is the cafe and towers together
if length <= 2:
row.write(0, currenttime)
row.write(1, currentname[1])
i += 1
l += 1

# if length = 4 index is formatted properly and we just grab the columns we need
elif length == 4:
a = 2

for values in range(0, 2):
row = worksheetwrite.row(l)
row.write(0, currenttime)
row.write(1, currentname[1])
row.write(2, currentname[a])
a += 1
i += 1
l += 1

# if length > 4 we need to iterate through each index and write them
else:

# define the length of the current cell of the excel sheet we have
lengthd = (len(currentname)+1)/2
length = int(lengthd)

for values in range(length):
# set variables that we will need to hold on to
currentcafe = currentname[g]
currenttower = currentname[b]

# if the value of the current tower is 2,4,5,6,7 we need to write the current cafe and current tower
# then move on to the next value and set that as the current cafe, we also need to jump to the
# value after the current cafe and set that as the current tower
if currenttower == "2" or currenttower == "4" or currenttower == "5" or currenttower == "6" or currenttower == "7":
row = worksheetwrite.row(l)
row.write(0, currenttime)
row.write(1, currentcafe)
row.write(2, currenttower)

b += 1
i += 1
g = b
b = g + 1
# otherwise we can just write the value fo the current cafe and current tower, advance to the next value
# which is another tower and set current tower to that
else:
row = worksheetwrite.row(l)
row.write(0, currenttime)
row.write(1, currentcafe)
row.write(2, currenttower)
b += 1
i += 1

l += 1
i += 1
t += 1
workbookwrite.save('output.xls')


This is the first time I have written anything with Python and I am fairly new to programming, any comments would be greatly appreciated.







share|improve this question

















  • 2




    Could you expand the column width to fully show the "notes" column? Also, could you summarize what this transformation is for?
    – 200_success
    Jun 7 at 20:55










  • I updated the the pic to better show relevant columns
    – azryel6500
    Jun 8 at 14:27











  • check out pandas
    – Maarten Fabré
    Jun 8 at 14:30
















up vote
5
down vote

favorite












I wrote this to parse through an Excel doc, extract desired columns, parse certain cells and output another formatted Excel doc.



This is the original Excel doc I have condensed most of the columns for space



excelsnip



Here are the the results of the program that I am mostly happy with, it parses the name column and writes a unit identifier(i.e 83) along with a tower identifier(i.e 5) on a new line along with a completed date if the completed column is TRUE. I think the only thing I would change would be getting rid of the blank line under the row that has H 7



resultsnip



And here is the code



import xlrd
import xlwt
import re

# open and create workbook
workbookread = xlrd.open_workbook('seedtestexcelbytask.xls')
workbookwrite = xlwt.Workbook('output.xlsx')

# open and create sheets
worksheetread = workbookread.sheet_by_index(0)
worksheetwrite = workbookwrite.add_sheet('Sheet1')

timelist =
towerlist =

i = 0
l = 0
t = 0

# iterate through rows in worksheet, get columns completed, completed_at and name.
# add to timelist and towerlist if completed == true
for rows in range(worksheetread.nrows):
cell0 = worksheetread.cell(rows, 2).value
cell1 = worksheetread.cell(rows, 3).value
cell2 = worksheetread.cell(rows, 15).value

if cell0 == 1:
parsedvalue = re.split(', |/', cell2)
timelist.append(cell1)
towerlist.append(parsedvalue)

b = 2
g = 1
for rows in range(len(timelist)):
row = worksheetwrite.row(i)

# get current index from timelist and towerlist and assign to currenttime and currentvalue

currenttime = timelist[t]
currentname = towerlist[t]

# get the length of the current index this will help us decide how to parse
length = len(towerlist[t])

# if length <= 2 the index is not formatted properly so we only need index 1 which is the cafe and towers together
if length <= 2:
row.write(0, currenttime)
row.write(1, currentname[1])
i += 1
l += 1

# if length = 4 index is formatted properly and we just grab the columns we need
elif length == 4:
a = 2

for values in range(0, 2):
row = worksheetwrite.row(l)
row.write(0, currenttime)
row.write(1, currentname[1])
row.write(2, currentname[a])
a += 1
i += 1
l += 1

# if length > 4 we need to iterate through each index and write them
else:

# define the length of the current cell of the excel sheet we have
lengthd = (len(currentname)+1)/2
length = int(lengthd)

for values in range(length):
# set variables that we will need to hold on to
currentcafe = currentname[g]
currenttower = currentname[b]

# if the value of the current tower is 2,4,5,6,7 we need to write the current cafe and current tower
# then move on to the next value and set that as the current cafe, we also need to jump to the
# value after the current cafe and set that as the current tower
if currenttower == "2" or currenttower == "4" or currenttower == "5" or currenttower == "6" or currenttower == "7":
row = worksheetwrite.row(l)
row.write(0, currenttime)
row.write(1, currentcafe)
row.write(2, currenttower)

b += 1
i += 1
g = b
b = g + 1
# otherwise we can just write the value fo the current cafe and current tower, advance to the next value
# which is another tower and set current tower to that
else:
row = worksheetwrite.row(l)
row.write(0, currenttime)
row.write(1, currentcafe)
row.write(2, currenttower)
b += 1
i += 1

l += 1
i += 1
t += 1
workbookwrite.save('output.xls')


This is the first time I have written anything with Python and I am fairly new to programming, any comments would be greatly appreciated.







share|improve this question

















  • 2




    Could you expand the column width to fully show the "notes" column? Also, could you summarize what this transformation is for?
    – 200_success
    Jun 7 at 20:55










  • I updated the the pic to better show relevant columns
    – azryel6500
    Jun 8 at 14:27











  • check out pandas
    – Maarten Fabré
    Jun 8 at 14:30












up vote
5
down vote

favorite









up vote
5
down vote

favorite











I wrote this to parse through an Excel doc, extract desired columns, parse certain cells and output another formatted Excel doc.



This is the original Excel doc I have condensed most of the columns for space



excelsnip



Here are the the results of the program that I am mostly happy with, it parses the name column and writes a unit identifier(i.e 83) along with a tower identifier(i.e 5) on a new line along with a completed date if the completed column is TRUE. I think the only thing I would change would be getting rid of the blank line under the row that has H 7



resultsnip



And here is the code



import xlrd
import xlwt
import re

# open and create workbook
workbookread = xlrd.open_workbook('seedtestexcelbytask.xls')
workbookwrite = xlwt.Workbook('output.xlsx')

# open and create sheets
worksheetread = workbookread.sheet_by_index(0)
worksheetwrite = workbookwrite.add_sheet('Sheet1')

timelist =
towerlist =

i = 0
l = 0
t = 0

# iterate through rows in worksheet, get columns completed, completed_at and name.
# add to timelist and towerlist if completed == true
for rows in range(worksheetread.nrows):
cell0 = worksheetread.cell(rows, 2).value
cell1 = worksheetread.cell(rows, 3).value
cell2 = worksheetread.cell(rows, 15).value

if cell0 == 1:
parsedvalue = re.split(', |/', cell2)
timelist.append(cell1)
towerlist.append(parsedvalue)

b = 2
g = 1
for rows in range(len(timelist)):
row = worksheetwrite.row(i)

# get current index from timelist and towerlist and assign to currenttime and currentvalue

currenttime = timelist[t]
currentname = towerlist[t]

# get the length of the current index this will help us decide how to parse
length = len(towerlist[t])

# if length <= 2 the index is not formatted properly so we only need index 1 which is the cafe and towers together
if length <= 2:
row.write(0, currenttime)
row.write(1, currentname[1])
i += 1
l += 1

# if length = 4 index is formatted properly and we just grab the columns we need
elif length == 4:
a = 2

for values in range(0, 2):
row = worksheetwrite.row(l)
row.write(0, currenttime)
row.write(1, currentname[1])
row.write(2, currentname[a])
a += 1
i += 1
l += 1

# if length > 4 we need to iterate through each index and write them
else:

# define the length of the current cell of the excel sheet we have
lengthd = (len(currentname)+1)/2
length = int(lengthd)

for values in range(length):
# set variables that we will need to hold on to
currentcafe = currentname[g]
currenttower = currentname[b]

# if the value of the current tower is 2,4,5,6,7 we need to write the current cafe and current tower
# then move on to the next value and set that as the current cafe, we also need to jump to the
# value after the current cafe and set that as the current tower
if currenttower == "2" or currenttower == "4" or currenttower == "5" or currenttower == "6" or currenttower == "7":
row = worksheetwrite.row(l)
row.write(0, currenttime)
row.write(1, currentcafe)
row.write(2, currenttower)

b += 1
i += 1
g = b
b = g + 1
# otherwise we can just write the value fo the current cafe and current tower, advance to the next value
# which is another tower and set current tower to that
else:
row = worksheetwrite.row(l)
row.write(0, currenttime)
row.write(1, currentcafe)
row.write(2, currenttower)
b += 1
i += 1

l += 1
i += 1
t += 1
workbookwrite.save('output.xls')


This is the first time I have written anything with Python and I am fairly new to programming, any comments would be greatly appreciated.







share|improve this question













I wrote this to parse through an Excel doc, extract desired columns, parse certain cells and output another formatted Excel doc.



This is the original Excel doc I have condensed most of the columns for space



excelsnip



Here are the the results of the program that I am mostly happy with, it parses the name column and writes a unit identifier(i.e 83) along with a tower identifier(i.e 5) on a new line along with a completed date if the completed column is TRUE. I think the only thing I would change would be getting rid of the blank line under the row that has H 7



resultsnip



And here is the code



import xlrd
import xlwt
import re

# open and create workbook
workbookread = xlrd.open_workbook('seedtestexcelbytask.xls')
workbookwrite = xlwt.Workbook('output.xlsx')

# open and create sheets
worksheetread = workbookread.sheet_by_index(0)
worksheetwrite = workbookwrite.add_sheet('Sheet1')

timelist =
towerlist =

i = 0
l = 0
t = 0

# iterate through rows in worksheet, get columns completed, completed_at and name.
# add to timelist and towerlist if completed == true
for rows in range(worksheetread.nrows):
cell0 = worksheetread.cell(rows, 2).value
cell1 = worksheetread.cell(rows, 3).value
cell2 = worksheetread.cell(rows, 15).value

if cell0 == 1:
parsedvalue = re.split(', |/', cell2)
timelist.append(cell1)
towerlist.append(parsedvalue)

b = 2
g = 1
for rows in range(len(timelist)):
row = worksheetwrite.row(i)

# get current index from timelist and towerlist and assign to currenttime and currentvalue

currenttime = timelist[t]
currentname = towerlist[t]

# get the length of the current index this will help us decide how to parse
length = len(towerlist[t])

# if length <= 2 the index is not formatted properly so we only need index 1 which is the cafe and towers together
if length <= 2:
row.write(0, currenttime)
row.write(1, currentname[1])
i += 1
l += 1

# if length = 4 index is formatted properly and we just grab the columns we need
elif length == 4:
a = 2

for values in range(0, 2):
row = worksheetwrite.row(l)
row.write(0, currenttime)
row.write(1, currentname[1])
row.write(2, currentname[a])
a += 1
i += 1
l += 1

# if length > 4 we need to iterate through each index and write them
else:

# define the length of the current cell of the excel sheet we have
lengthd = (len(currentname)+1)/2
length = int(lengthd)

for values in range(length):
# set variables that we will need to hold on to
currentcafe = currentname[g]
currenttower = currentname[b]

# if the value of the current tower is 2,4,5,6,7 we need to write the current cafe and current tower
# then move on to the next value and set that as the current cafe, we also need to jump to the
# value after the current cafe and set that as the current tower
if currenttower == "2" or currenttower == "4" or currenttower == "5" or currenttower == "6" or currenttower == "7":
row = worksheetwrite.row(l)
row.write(0, currenttime)
row.write(1, currentcafe)
row.write(2, currenttower)

b += 1
i += 1
g = b
b = g + 1
# otherwise we can just write the value fo the current cafe and current tower, advance to the next value
# which is another tower and set current tower to that
else:
row = worksheetwrite.row(l)
row.write(0, currenttime)
row.write(1, currentcafe)
row.write(2, currenttower)
b += 1
i += 1

l += 1
i += 1
t += 1
workbookwrite.save('output.xls')


This is the first time I have written anything with Python and I am fairly new to programming, any comments would be greatly appreciated.









share|improve this question












share|improve this question




share|improve this question








edited Jun 8 at 14:26
























asked Jun 7 at 20:13









azryel6500

263




263







  • 2




    Could you expand the column width to fully show the "notes" column? Also, could you summarize what this transformation is for?
    – 200_success
    Jun 7 at 20:55










  • I updated the the pic to better show relevant columns
    – azryel6500
    Jun 8 at 14:27











  • check out pandas
    – Maarten Fabré
    Jun 8 at 14:30












  • 2




    Could you expand the column width to fully show the "notes" column? Also, could you summarize what this transformation is for?
    – 200_success
    Jun 7 at 20:55










  • I updated the the pic to better show relevant columns
    – azryel6500
    Jun 8 at 14:27











  • check out pandas
    – Maarten Fabré
    Jun 8 at 14:30







2




2




Could you expand the column width to fully show the "notes" column? Also, could you summarize what this transformation is for?
– 200_success
Jun 7 at 20:55




Could you expand the column width to fully show the "notes" column? Also, could you summarize what this transformation is for?
– 200_success
Jun 7 at 20:55












I updated the the pic to better show relevant columns
– azryel6500
Jun 8 at 14:27





I updated the the pic to better show relevant columns
– azryel6500
Jun 8 at 14:27













check out pandas
– Maarten Fabré
Jun 8 at 14:30




check out pandas
– Maarten Fabré
Jun 8 at 14:30















active

oldest

votes











Your Answer




StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);








 

draft saved


draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f196066%2fpython-program-to-transform-an-excel-document-with-some-parsing%23new-answer', 'question_page');

);

Post as a guest



































active

oldest

votes













active

oldest

votes









active

oldest

votes






active

oldest

votes










 

draft saved


draft discarded


























 


draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f196066%2fpython-program-to-transform-an-excel-document-with-some-parsing%23new-answer', 'question_page');

);

Post as a guest













































































TvXiWO uqpa3aKLxXmg8hf238ZRxfsiMjUYLsEfOCn2TbiCR,5 B6YGgMzVAzIH2qod1Yhblw3X9T5lnwMdCsE
Aji8 nTzaafPXJIxolJGm1C,nywH1LR7 N2p1Xl R4cEZdLAwWY l Xb

Popular posts from this blog

Chat program with C++ and SFML

Function to Return a JSON Like Objects Using VBA Collections and Arrays

Read an image with ADNS2610 optical sensor and Arduino Uno