Segment road map data with heuristics

Clash Royale CLAN TAG#URR8PPP
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;
up vote
2
down vote
favorite
Backstory
A little while ago I was assigned with the task of clustering some road segments using open street map data. After parsing a huge XML (maybe that should be another review) I ended up with the following data format:
struct node
std::string id;
double lat;
double lon;
int refs;
node();
void print();
;
struct road
std::string id;
std::string type;
std::vector<int> nodes;
;
So a node has the 2 coordinates plus an id and a ref number, because later I'm supposed to keep track when a node is part of 2 different roads. A road has an id and a type and the vector has the indexes of its nodes in a vector<node> I'm using to keep the nodes. Also this vector doesn't change in size, so it's safe to keep indexes like that.
Segmenting
My actual task is to make segments out of these roads according to the following criteria:
- When the curvature of the segment's circumcircle exceeds a heuristic value
- When the size of a segment exceeds a heuristic value
- When a node is also part of another road; in this case the segment ends after all the nodes that are part of other roads.
The output should be a text file with the following format:
segment_id, way_id, no_of_nodes, node1_lat, node_1_lon, .... lastnode_lat, lastnode_lon
My solution is this
inline double curvature(double l1, double l2, double l3)
return l1*l2*l3/sqrt((l1+l2+l3)*(l2+l3-l1)*(l3+l1-l2)*(l1+l2-l3));
void write_segment(ofstream &out, int &id, string way_id, vector<double> &coords,
int &minsize, int &maxsize, int &tot_n)
out << id << ", " << way_id << ", " << coords.size()/2 << ", ";
for(size_t i=0; i<coords.size()-1; i++)
out << coords[i] << ", ";
out << coords.back() << 'n';
//check if we have new max/min
if(coords.size()/2 < minsize)
minsize = coords.size()/2;
else if(coords.size()/2 > maxsize)
maxsize = coords.size()/2;
//ingrement the id
id++;
tot_n += coords.size()/2;
//keep last node because it's also a part of the next segment
coords.erase(coords.begin(), coords.end()-2);
void make_segments(const vector<road> &roads, const vector<node> &nodes,
const string &out_s)
ofstream out(out_s);
bool writefalse;
//lengths for every 3 points, total and temporary curvatures
double l1, l2, l3, curb, curv;
double thrs0.03; //curvature threshold over which a segment breaks
int segid; //serial segment id for every new segment
//count for average curvature, tot_n -> nodes written
int count, tot_n, totalnodes;
//min and max segment sizes for the statistics
int maxsize, minsizestd::numeric_limits<int>::max();
size_t nthrs200, minthrs4; //min and max threshold for segment size
vector<double> coords; //keep all the coordinates until written to file
cout << "Segmenting:n";
for(size_t i=0; i<roads.size(); i++)
totalnodes += roads[i].nodes.size();
coords.push_back(nodes[roads[i].nodes[0]].lat);
coords.push_back(nodes[roads[i].nodes[0]].lon);
//special case for roads with one node
if(roads[i].nodes.size() ==1)
out << segid << ", " << roads[i].id << ", " << 1 << ", ";
out << coords[0] << ", " << coords[1] << 'n';
segid++;
tot_n++;
coords.clear();
continue;
//make one segment for small roads (according to minthrs)
if(roads[i].nodes.size() <= minthrs)
for(size_t j=1; j<roads[i].nodes.size(); j++)
coords.push_back(nodes[roads[i].nodes[j]].lat);
coords.push_back(nodes[roads[i].nodes[j]].lon);
write_segment(out, segid, roads[i].id, coords, minsize, maxsize, tot_n);
continue;
//for every other road segment it according to heuristics
for(size_t j=1; j<roads[i].nodes.size()-minthrs; j++)
for(size_t j=roads[i].nodes.size()-minthrs; j<roads[i].nodes.size(); j++)
coords.push_back(nodes[roads[i].nodes[j]].lat);
coords.push_back(nodes[roads[i].nodes[j]].lon);
write_segment(out, segid, roads[i].id, coords, minsize, maxsize, tot_n);
coords.clear();
cout << "Maxsize: " << maxsize << 'n';
cout << "Minsize: " << minsize << 'n';
cout << "Segs: " << segid << 'n';
cout << "Nodes written: " << tot_n << 'n';
cout << "Total nodes were: " << totalnodes << 'n';
cout << "Average curvature: " << double(curv/count) << endl;
I'm interested in good and bad practices this code follows, readability, especially when documenting what each variable stands for and any performance issues.
c++ c++11 computational-geometry
add a comment |Â
up vote
2
down vote
favorite
Backstory
A little while ago I was assigned with the task of clustering some road segments using open street map data. After parsing a huge XML (maybe that should be another review) I ended up with the following data format:
struct node
std::string id;
double lat;
double lon;
int refs;
node();
void print();
;
struct road
std::string id;
std::string type;
std::vector<int> nodes;
;
So a node has the 2 coordinates plus an id and a ref number, because later I'm supposed to keep track when a node is part of 2 different roads. A road has an id and a type and the vector has the indexes of its nodes in a vector<node> I'm using to keep the nodes. Also this vector doesn't change in size, so it's safe to keep indexes like that.
Segmenting
My actual task is to make segments out of these roads according to the following criteria:
- When the curvature of the segment's circumcircle exceeds a heuristic value
- When the size of a segment exceeds a heuristic value
- When a node is also part of another road; in this case the segment ends after all the nodes that are part of other roads.
The output should be a text file with the following format:
segment_id, way_id, no_of_nodes, node1_lat, node_1_lon, .... lastnode_lat, lastnode_lon
My solution is this
inline double curvature(double l1, double l2, double l3)
return l1*l2*l3/sqrt((l1+l2+l3)*(l2+l3-l1)*(l3+l1-l2)*(l1+l2-l3));
void write_segment(ofstream &out, int &id, string way_id, vector<double> &coords,
int &minsize, int &maxsize, int &tot_n)
out << id << ", " << way_id << ", " << coords.size()/2 << ", ";
for(size_t i=0; i<coords.size()-1; i++)
out << coords[i] << ", ";
out << coords.back() << 'n';
//check if we have new max/min
if(coords.size()/2 < minsize)
minsize = coords.size()/2;
else if(coords.size()/2 > maxsize)
maxsize = coords.size()/2;
//ingrement the id
id++;
tot_n += coords.size()/2;
//keep last node because it's also a part of the next segment
coords.erase(coords.begin(), coords.end()-2);
void make_segments(const vector<road> &roads, const vector<node> &nodes,
const string &out_s)
ofstream out(out_s);
bool writefalse;
//lengths for every 3 points, total and temporary curvatures
double l1, l2, l3, curb, curv;
double thrs0.03; //curvature threshold over which a segment breaks
int segid; //serial segment id for every new segment
//count for average curvature, tot_n -> nodes written
int count, tot_n, totalnodes;
//min and max segment sizes for the statistics
int maxsize, minsizestd::numeric_limits<int>::max();
size_t nthrs200, minthrs4; //min and max threshold for segment size
vector<double> coords; //keep all the coordinates until written to file
cout << "Segmenting:n";
for(size_t i=0; i<roads.size(); i++)
totalnodes += roads[i].nodes.size();
coords.push_back(nodes[roads[i].nodes[0]].lat);
coords.push_back(nodes[roads[i].nodes[0]].lon);
//special case for roads with one node
if(roads[i].nodes.size() ==1)
out << segid << ", " << roads[i].id << ", " << 1 << ", ";
out << coords[0] << ", " << coords[1] << 'n';
segid++;
tot_n++;
coords.clear();
continue;
//make one segment for small roads (according to minthrs)
if(roads[i].nodes.size() <= minthrs)
for(size_t j=1; j<roads[i].nodes.size(); j++)
coords.push_back(nodes[roads[i].nodes[j]].lat);
coords.push_back(nodes[roads[i].nodes[j]].lon);
write_segment(out, segid, roads[i].id, coords, minsize, maxsize, tot_n);
continue;
//for every other road segment it according to heuristics
for(size_t j=1; j<roads[i].nodes.size()-minthrs; j++)
for(size_t j=roads[i].nodes.size()-minthrs; j<roads[i].nodes.size(); j++)
coords.push_back(nodes[roads[i].nodes[j]].lat);
coords.push_back(nodes[roads[i].nodes[j]].lon);
write_segment(out, segid, roads[i].id, coords, minsize, maxsize, tot_n);
coords.clear();
cout << "Maxsize: " << maxsize << 'n';
cout << "Minsize: " << minsize << 'n';
cout << "Segs: " << segid << 'n';
cout << "Nodes written: " << tot_n << 'n';
cout << "Total nodes were: " << totalnodes << 'n';
cout << "Average curvature: " << double(curv/count) << endl;
I'm interested in good and bad practices this code follows, readability, especially when documenting what each variable stands for and any performance issues.
c++ c++11 computational-geometry
add a comment |Â
up vote
2
down vote
favorite
up vote
2
down vote
favorite
Backstory
A little while ago I was assigned with the task of clustering some road segments using open street map data. After parsing a huge XML (maybe that should be another review) I ended up with the following data format:
struct node
std::string id;
double lat;
double lon;
int refs;
node();
void print();
;
struct road
std::string id;
std::string type;
std::vector<int> nodes;
;
So a node has the 2 coordinates plus an id and a ref number, because later I'm supposed to keep track when a node is part of 2 different roads. A road has an id and a type and the vector has the indexes of its nodes in a vector<node> I'm using to keep the nodes. Also this vector doesn't change in size, so it's safe to keep indexes like that.
Segmenting
My actual task is to make segments out of these roads according to the following criteria:
- When the curvature of the segment's circumcircle exceeds a heuristic value
- When the size of a segment exceeds a heuristic value
- When a node is also part of another road; in this case the segment ends after all the nodes that are part of other roads.
The output should be a text file with the following format:
segment_id, way_id, no_of_nodes, node1_lat, node_1_lon, .... lastnode_lat, lastnode_lon
My solution is this
inline double curvature(double l1, double l2, double l3)
return l1*l2*l3/sqrt((l1+l2+l3)*(l2+l3-l1)*(l3+l1-l2)*(l1+l2-l3));
void write_segment(ofstream &out, int &id, string way_id, vector<double> &coords,
int &minsize, int &maxsize, int &tot_n)
out << id << ", " << way_id << ", " << coords.size()/2 << ", ";
for(size_t i=0; i<coords.size()-1; i++)
out << coords[i] << ", ";
out << coords.back() << 'n';
//check if we have new max/min
if(coords.size()/2 < minsize)
minsize = coords.size()/2;
else if(coords.size()/2 > maxsize)
maxsize = coords.size()/2;
//ingrement the id
id++;
tot_n += coords.size()/2;
//keep last node because it's also a part of the next segment
coords.erase(coords.begin(), coords.end()-2);
void make_segments(const vector<road> &roads, const vector<node> &nodes,
const string &out_s)
ofstream out(out_s);
bool writefalse;
//lengths for every 3 points, total and temporary curvatures
double l1, l2, l3, curb, curv;
double thrs0.03; //curvature threshold over which a segment breaks
int segid; //serial segment id for every new segment
//count for average curvature, tot_n -> nodes written
int count, tot_n, totalnodes;
//min and max segment sizes for the statistics
int maxsize, minsizestd::numeric_limits<int>::max();
size_t nthrs200, minthrs4; //min and max threshold for segment size
vector<double> coords; //keep all the coordinates until written to file
cout << "Segmenting:n";
for(size_t i=0; i<roads.size(); i++)
totalnodes += roads[i].nodes.size();
coords.push_back(nodes[roads[i].nodes[0]].lat);
coords.push_back(nodes[roads[i].nodes[0]].lon);
//special case for roads with one node
if(roads[i].nodes.size() ==1)
out << segid << ", " << roads[i].id << ", " << 1 << ", ";
out << coords[0] << ", " << coords[1] << 'n';
segid++;
tot_n++;
coords.clear();
continue;
//make one segment for small roads (according to minthrs)
if(roads[i].nodes.size() <= minthrs)
for(size_t j=1; j<roads[i].nodes.size(); j++)
coords.push_back(nodes[roads[i].nodes[j]].lat);
coords.push_back(nodes[roads[i].nodes[j]].lon);
write_segment(out, segid, roads[i].id, coords, minsize, maxsize, tot_n);
continue;
//for every other road segment it according to heuristics
for(size_t j=1; j<roads[i].nodes.size()-minthrs; j++)
for(size_t j=roads[i].nodes.size()-minthrs; j<roads[i].nodes.size(); j++)
coords.push_back(nodes[roads[i].nodes[j]].lat);
coords.push_back(nodes[roads[i].nodes[j]].lon);
write_segment(out, segid, roads[i].id, coords, minsize, maxsize, tot_n);
coords.clear();
cout << "Maxsize: " << maxsize << 'n';
cout << "Minsize: " << minsize << 'n';
cout << "Segs: " << segid << 'n';
cout << "Nodes written: " << tot_n << 'n';
cout << "Total nodes were: " << totalnodes << 'n';
cout << "Average curvature: " << double(curv/count) << endl;
I'm interested in good and bad practices this code follows, readability, especially when documenting what each variable stands for and any performance issues.
c++ c++11 computational-geometry
Backstory
A little while ago I was assigned with the task of clustering some road segments using open street map data. After parsing a huge XML (maybe that should be another review) I ended up with the following data format:
struct node
std::string id;
double lat;
double lon;
int refs;
node();
void print();
;
struct road
std::string id;
std::string type;
std::vector<int> nodes;
;
So a node has the 2 coordinates plus an id and a ref number, because later I'm supposed to keep track when a node is part of 2 different roads. A road has an id and a type and the vector has the indexes of its nodes in a vector<node> I'm using to keep the nodes. Also this vector doesn't change in size, so it's safe to keep indexes like that.
Segmenting
My actual task is to make segments out of these roads according to the following criteria:
- When the curvature of the segment's circumcircle exceeds a heuristic value
- When the size of a segment exceeds a heuristic value
- When a node is also part of another road; in this case the segment ends after all the nodes that are part of other roads.
The output should be a text file with the following format:
segment_id, way_id, no_of_nodes, node1_lat, node_1_lon, .... lastnode_lat, lastnode_lon
My solution is this
inline double curvature(double l1, double l2, double l3)
return l1*l2*l3/sqrt((l1+l2+l3)*(l2+l3-l1)*(l3+l1-l2)*(l1+l2-l3));
void write_segment(ofstream &out, int &id, string way_id, vector<double> &coords,
int &minsize, int &maxsize, int &tot_n)
out << id << ", " << way_id << ", " << coords.size()/2 << ", ";
for(size_t i=0; i<coords.size()-1; i++)
out << coords[i] << ", ";
out << coords.back() << 'n';
//check if we have new max/min
if(coords.size()/2 < minsize)
minsize = coords.size()/2;
else if(coords.size()/2 > maxsize)
maxsize = coords.size()/2;
//ingrement the id
id++;
tot_n += coords.size()/2;
//keep last node because it's also a part of the next segment
coords.erase(coords.begin(), coords.end()-2);
void make_segments(const vector<road> &roads, const vector<node> &nodes,
const string &out_s)
ofstream out(out_s);
bool writefalse;
//lengths for every 3 points, total and temporary curvatures
double l1, l2, l3, curb, curv;
double thrs0.03; //curvature threshold over which a segment breaks
int segid; //serial segment id for every new segment
//count for average curvature, tot_n -> nodes written
int count, tot_n, totalnodes;
//min and max segment sizes for the statistics
int maxsize, minsizestd::numeric_limits<int>::max();
size_t nthrs200, minthrs4; //min and max threshold for segment size
vector<double> coords; //keep all the coordinates until written to file
cout << "Segmenting:n";
for(size_t i=0; i<roads.size(); i++)
totalnodes += roads[i].nodes.size();
coords.push_back(nodes[roads[i].nodes[0]].lat);
coords.push_back(nodes[roads[i].nodes[0]].lon);
//special case for roads with one node
if(roads[i].nodes.size() ==1)
out << segid << ", " << roads[i].id << ", " << 1 << ", ";
out << coords[0] << ", " << coords[1] << 'n';
segid++;
tot_n++;
coords.clear();
continue;
//make one segment for small roads (according to minthrs)
if(roads[i].nodes.size() <= minthrs)
for(size_t j=1; j<roads[i].nodes.size(); j++)
coords.push_back(nodes[roads[i].nodes[j]].lat);
coords.push_back(nodes[roads[i].nodes[j]].lon);
write_segment(out, segid, roads[i].id, coords, minsize, maxsize, tot_n);
continue;
//for every other road segment it according to heuristics
for(size_t j=1; j<roads[i].nodes.size()-minthrs; j++)
for(size_t j=roads[i].nodes.size()-minthrs; j<roads[i].nodes.size(); j++)
coords.push_back(nodes[roads[i].nodes[j]].lat);
coords.push_back(nodes[roads[i].nodes[j]].lon);
write_segment(out, segid, roads[i].id, coords, minsize, maxsize, tot_n);
coords.clear();
cout << "Maxsize: " << maxsize << 'n';
cout << "Minsize: " << minsize << 'n';
cout << "Segs: " << segid << 'n';
cout << "Nodes written: " << tot_n << 'n';
cout << "Total nodes were: " << totalnodes << 'n';
cout << "Average curvature: " << double(curv/count) << endl;
I'm interested in good and bad practices this code follows, readability, especially when documenting what each variable stands for and any performance issues.
c++ c++11 computational-geometry
edited Jan 9 at 2:33
Jamalâ¦
30.1k11114225
30.1k11114225
asked Jan 8 at 14:56
Konstantinoscs
2511
2511
add a comment |Â
add a comment |Â
1 Answer
1
active
oldest
votes
up vote
1
down vote
accepted
Prefer Clear Naming over using namespace std
According to the MSDN website:
Namespaces are used to organize code into logical groups and to prevent name collisions that can occur especially when your code base includes multiple libraries.
A collision is when 2 different functions have the same name, the same argument types and a similar functionality (this is why they have the same name). Someone developing software may want to override a function such as std::cout, std::cin or they may want to override the functionality of a class such as std::vector or std::stack. Namespaces allow these constructs to be overridden.
The use of the programming statement:
using namespace std;
hides the fact that cin, cout, vector and stack are coming from the namespace std where cin, cout,
vector and stack are used in the code. This can cause confusion of where the code is actually
coming from.
As the software becomes more complex and uses more libraries this becomes a bigger problem.
For a more detailed discussion of why it is a bad idea to use using namespace std; see this stackoverflow question and stackoverflow question.
Reduce Complexity, Follow SRP
The Single Responsibility Principle states that every module or class should have responsibility over a single part of the functionality provided by the software, and that responsibility should be entirely encapsulated by the class. All its services should be narrowly aligned with that responsibility.
Robert C. Martin expresses the principle as follows:
A class should have only one reason to change.
While this is primarily targeted at classes in object oriented languages it applies to functions and subroutines well.
The void make_segments(const std::vector<road> &roads, const std::vector<node> &nodes, const std::string &out_s) function could be broken up into at multiple functions. The following code is a good candidate for a function to be called by make_segments:
//for every other road segment it according to heuristics
for (size_t j = 1; j<roads[i].nodes.size() - minthrs; j++) write)
//check for curvature or maxsize
write_segment(out, segid, roads[i].id, coords, minsize, maxsize, tot_n);
write = false;
The more separate functions there are the easier it is to understand or read the code. This also makes it easier for any programmer to maintain or debug the code.
Prefer Iterators Over Indexes for Container Classes
The function void write_segment(std::ofstream &out, int &id, std::string way_id, std::vector<double> &coords, int &minsize, int &maxsize, int &tot_n) contains the following code:
for (size_t i = 0; i<coords.size() - 1; i++)
out << coords[i] << ", ";
A more modern and less C programming language like construct would be:
for (auto coord_iterator: coords)
out << coord_iterator << ",";
Using iterators in this manner makes it much easier to program correctly.
Declare Variables as Close to Use as Possible
In the loop defined as a possible candidate for a function there are 3 variables that should be defined within the loop rather than at the top of the function. The variables l1, l2 and l3 are only assigned values and used within the loop.
//find the 3 sides of the triangle for every 3 points
double l1 = euclid_dist(nodes[roads[i].nodes[j - 1]].lat, nodes[roads[i].nodes[j - 1]].lon,
nodes[roads[i].nodes[j]].lat, nodes[roads[i].nodes[j]].lon);
double l2 = euclid_dist(nodes[roads[i].nodes[j]].lat, nodes[roads[i].nodes[j]].lon,
nodes[roads[i].nodes[j + 1]].lat, nodes[roads[i].nodes[j + 1]].lon);
double l3 = euclid_dist(nodes[roads[i].nodes[j + 1]].lat, nodes[roads[i].nodes[j + 1]].lon,
nodes[roads[i].nodes[j - 1]].lat, nodes[roads[i].nodes[j - 1]].lon);
//compute circumcircle radius (curvature)
curb = curvature(l1, l2, l3);
count++;
1
Yourfor autoloop writes an extra comma at the end, which the original code does not.
â Roland Illig
Jan 9 at 0:57
@RolandIllig true.
â pacmaninbw
Jan 9 at 1:18
I thought about declaring the l1/2/3 variables inside the for loop but isn't the constant memory allocation bad for performance? Putting std:: I think it's going to be very hard for me but I read about its advantages.
â Konstantinoscs
Jan 9 at 2:58
@Konstantinoscs If you are using an optimizing compiler at the O3 level performance shouldn't be affected by these declarations. You actually have a nice little function there called find_the_triangle if you want it, if curb = find_the_triangle().
â pacmaninbw
Jan 9 at 3:23
Can you elaborate on the optimization a bit? Are there rules of thumb about what to not care about if you put -O?
â Konstantinoscs
Jan 9 at 3:42
 |Â
show 1 more comment
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
accepted
Prefer Clear Naming over using namespace std
According to the MSDN website:
Namespaces are used to organize code into logical groups and to prevent name collisions that can occur especially when your code base includes multiple libraries.
A collision is when 2 different functions have the same name, the same argument types and a similar functionality (this is why they have the same name). Someone developing software may want to override a function such as std::cout, std::cin or they may want to override the functionality of a class such as std::vector or std::stack. Namespaces allow these constructs to be overridden.
The use of the programming statement:
using namespace std;
hides the fact that cin, cout, vector and stack are coming from the namespace std where cin, cout,
vector and stack are used in the code. This can cause confusion of where the code is actually
coming from.
As the software becomes more complex and uses more libraries this becomes a bigger problem.
For a more detailed discussion of why it is a bad idea to use using namespace std; see this stackoverflow question and stackoverflow question.
Reduce Complexity, Follow SRP
The Single Responsibility Principle states that every module or class should have responsibility over a single part of the functionality provided by the software, and that responsibility should be entirely encapsulated by the class. All its services should be narrowly aligned with that responsibility.
Robert C. Martin expresses the principle as follows:
A class should have only one reason to change.
While this is primarily targeted at classes in object oriented languages it applies to functions and subroutines well.
The void make_segments(const std::vector<road> &roads, const std::vector<node> &nodes, const std::string &out_s) function could be broken up into at multiple functions. The following code is a good candidate for a function to be called by make_segments:
//for every other road segment it according to heuristics
for (size_t j = 1; j<roads[i].nodes.size() - minthrs; j++) write)
//check for curvature or maxsize
write_segment(out, segid, roads[i].id, coords, minsize, maxsize, tot_n);
write = false;
The more separate functions there are the easier it is to understand or read the code. This also makes it easier for any programmer to maintain or debug the code.
Prefer Iterators Over Indexes for Container Classes
The function void write_segment(std::ofstream &out, int &id, std::string way_id, std::vector<double> &coords, int &minsize, int &maxsize, int &tot_n) contains the following code:
for (size_t i = 0; i<coords.size() - 1; i++)
out << coords[i] << ", ";
A more modern and less C programming language like construct would be:
for (auto coord_iterator: coords)
out << coord_iterator << ",";
Using iterators in this manner makes it much easier to program correctly.
Declare Variables as Close to Use as Possible
In the loop defined as a possible candidate for a function there are 3 variables that should be defined within the loop rather than at the top of the function. The variables l1, l2 and l3 are only assigned values and used within the loop.
//find the 3 sides of the triangle for every 3 points
double l1 = euclid_dist(nodes[roads[i].nodes[j - 1]].lat, nodes[roads[i].nodes[j - 1]].lon,
nodes[roads[i].nodes[j]].lat, nodes[roads[i].nodes[j]].lon);
double l2 = euclid_dist(nodes[roads[i].nodes[j]].lat, nodes[roads[i].nodes[j]].lon,
nodes[roads[i].nodes[j + 1]].lat, nodes[roads[i].nodes[j + 1]].lon);
double l3 = euclid_dist(nodes[roads[i].nodes[j + 1]].lat, nodes[roads[i].nodes[j + 1]].lon,
nodes[roads[i].nodes[j - 1]].lat, nodes[roads[i].nodes[j - 1]].lon);
//compute circumcircle radius (curvature)
curb = curvature(l1, l2, l3);
count++;
1
Yourfor autoloop writes an extra comma at the end, which the original code does not.
â Roland Illig
Jan 9 at 0:57
@RolandIllig true.
â pacmaninbw
Jan 9 at 1:18
I thought about declaring the l1/2/3 variables inside the for loop but isn't the constant memory allocation bad for performance? Putting std:: I think it's going to be very hard for me but I read about its advantages.
â Konstantinoscs
Jan 9 at 2:58
@Konstantinoscs If you are using an optimizing compiler at the O3 level performance shouldn't be affected by these declarations. You actually have a nice little function there called find_the_triangle if you want it, if curb = find_the_triangle().
â pacmaninbw
Jan 9 at 3:23
Can you elaborate on the optimization a bit? Are there rules of thumb about what to not care about if you put -O?
â Konstantinoscs
Jan 9 at 3:42
 |Â
show 1 more comment
up vote
1
down vote
accepted
Prefer Clear Naming over using namespace std
According to the MSDN website:
Namespaces are used to organize code into logical groups and to prevent name collisions that can occur especially when your code base includes multiple libraries.
A collision is when 2 different functions have the same name, the same argument types and a similar functionality (this is why they have the same name). Someone developing software may want to override a function such as std::cout, std::cin or they may want to override the functionality of a class such as std::vector or std::stack. Namespaces allow these constructs to be overridden.
The use of the programming statement:
using namespace std;
hides the fact that cin, cout, vector and stack are coming from the namespace std where cin, cout,
vector and stack are used in the code. This can cause confusion of where the code is actually
coming from.
As the software becomes more complex and uses more libraries this becomes a bigger problem.
For a more detailed discussion of why it is a bad idea to use using namespace std; see this stackoverflow question and stackoverflow question.
Reduce Complexity, Follow SRP
The Single Responsibility Principle states that every module or class should have responsibility over a single part of the functionality provided by the software, and that responsibility should be entirely encapsulated by the class. All its services should be narrowly aligned with that responsibility.
Robert C. Martin expresses the principle as follows:
A class should have only one reason to change.
While this is primarily targeted at classes in object oriented languages it applies to functions and subroutines well.
The void make_segments(const std::vector<road> &roads, const std::vector<node> &nodes, const std::string &out_s) function could be broken up into at multiple functions. The following code is a good candidate for a function to be called by make_segments:
//for every other road segment it according to heuristics
for (size_t j = 1; j<roads[i].nodes.size() - minthrs; j++) write)
//check for curvature or maxsize
write_segment(out, segid, roads[i].id, coords, minsize, maxsize, tot_n);
write = false;
The more separate functions there are the easier it is to understand or read the code. This also makes it easier for any programmer to maintain or debug the code.
Prefer Iterators Over Indexes for Container Classes
The function void write_segment(std::ofstream &out, int &id, std::string way_id, std::vector<double> &coords, int &minsize, int &maxsize, int &tot_n) contains the following code:
for (size_t i = 0; i<coords.size() - 1; i++)
out << coords[i] << ", ";
A more modern and less C programming language like construct would be:
for (auto coord_iterator: coords)
out << coord_iterator << ",";
Using iterators in this manner makes it much easier to program correctly.
Declare Variables as Close to Use as Possible
In the loop defined as a possible candidate for a function there are 3 variables that should be defined within the loop rather than at the top of the function. The variables l1, l2 and l3 are only assigned values and used within the loop.
//find the 3 sides of the triangle for every 3 points
double l1 = euclid_dist(nodes[roads[i].nodes[j - 1]].lat, nodes[roads[i].nodes[j - 1]].lon,
nodes[roads[i].nodes[j]].lat, nodes[roads[i].nodes[j]].lon);
double l2 = euclid_dist(nodes[roads[i].nodes[j]].lat, nodes[roads[i].nodes[j]].lon,
nodes[roads[i].nodes[j + 1]].lat, nodes[roads[i].nodes[j + 1]].lon);
double l3 = euclid_dist(nodes[roads[i].nodes[j + 1]].lat, nodes[roads[i].nodes[j + 1]].lon,
nodes[roads[i].nodes[j - 1]].lat, nodes[roads[i].nodes[j - 1]].lon);
//compute circumcircle radius (curvature)
curb = curvature(l1, l2, l3);
count++;
1
Yourfor autoloop writes an extra comma at the end, which the original code does not.
â Roland Illig
Jan 9 at 0:57
@RolandIllig true.
â pacmaninbw
Jan 9 at 1:18
I thought about declaring the l1/2/3 variables inside the for loop but isn't the constant memory allocation bad for performance? Putting std:: I think it's going to be very hard for me but I read about its advantages.
â Konstantinoscs
Jan 9 at 2:58
@Konstantinoscs If you are using an optimizing compiler at the O3 level performance shouldn't be affected by these declarations. You actually have a nice little function there called find_the_triangle if you want it, if curb = find_the_triangle().
â pacmaninbw
Jan 9 at 3:23
Can you elaborate on the optimization a bit? Are there rules of thumb about what to not care about if you put -O?
â Konstantinoscs
Jan 9 at 3:42
 |Â
show 1 more comment
up vote
1
down vote
accepted
up vote
1
down vote
accepted
Prefer Clear Naming over using namespace std
According to the MSDN website:
Namespaces are used to organize code into logical groups and to prevent name collisions that can occur especially when your code base includes multiple libraries.
A collision is when 2 different functions have the same name, the same argument types and a similar functionality (this is why they have the same name). Someone developing software may want to override a function such as std::cout, std::cin or they may want to override the functionality of a class such as std::vector or std::stack. Namespaces allow these constructs to be overridden.
The use of the programming statement:
using namespace std;
hides the fact that cin, cout, vector and stack are coming from the namespace std where cin, cout,
vector and stack are used in the code. This can cause confusion of where the code is actually
coming from.
As the software becomes more complex and uses more libraries this becomes a bigger problem.
For a more detailed discussion of why it is a bad idea to use using namespace std; see this stackoverflow question and stackoverflow question.
Reduce Complexity, Follow SRP
The Single Responsibility Principle states that every module or class should have responsibility over a single part of the functionality provided by the software, and that responsibility should be entirely encapsulated by the class. All its services should be narrowly aligned with that responsibility.
Robert C. Martin expresses the principle as follows:
A class should have only one reason to change.
While this is primarily targeted at classes in object oriented languages it applies to functions and subroutines well.
The void make_segments(const std::vector<road> &roads, const std::vector<node> &nodes, const std::string &out_s) function could be broken up into at multiple functions. The following code is a good candidate for a function to be called by make_segments:
//for every other road segment it according to heuristics
for (size_t j = 1; j<roads[i].nodes.size() - minthrs; j++) write)
//check for curvature or maxsize
write_segment(out, segid, roads[i].id, coords, minsize, maxsize, tot_n);
write = false;
The more separate functions there are the easier it is to understand or read the code. This also makes it easier for any programmer to maintain or debug the code.
Prefer Iterators Over Indexes for Container Classes
The function void write_segment(std::ofstream &out, int &id, std::string way_id, std::vector<double> &coords, int &minsize, int &maxsize, int &tot_n) contains the following code:
for (size_t i = 0; i<coords.size() - 1; i++)
out << coords[i] << ", ";
A more modern and less C programming language like construct would be:
for (auto coord_iterator: coords)
out << coord_iterator << ",";
Using iterators in this manner makes it much easier to program correctly.
Declare Variables as Close to Use as Possible
In the loop defined as a possible candidate for a function there are 3 variables that should be defined within the loop rather than at the top of the function. The variables l1, l2 and l3 are only assigned values and used within the loop.
//find the 3 sides of the triangle for every 3 points
double l1 = euclid_dist(nodes[roads[i].nodes[j - 1]].lat, nodes[roads[i].nodes[j - 1]].lon,
nodes[roads[i].nodes[j]].lat, nodes[roads[i].nodes[j]].lon);
double l2 = euclid_dist(nodes[roads[i].nodes[j]].lat, nodes[roads[i].nodes[j]].lon,
nodes[roads[i].nodes[j + 1]].lat, nodes[roads[i].nodes[j + 1]].lon);
double l3 = euclid_dist(nodes[roads[i].nodes[j + 1]].lat, nodes[roads[i].nodes[j + 1]].lon,
nodes[roads[i].nodes[j - 1]].lat, nodes[roads[i].nodes[j - 1]].lon);
//compute circumcircle radius (curvature)
curb = curvature(l1, l2, l3);
count++;
Prefer Clear Naming over using namespace std
According to the MSDN website:
Namespaces are used to organize code into logical groups and to prevent name collisions that can occur especially when your code base includes multiple libraries.
A collision is when 2 different functions have the same name, the same argument types and a similar functionality (this is why they have the same name). Someone developing software may want to override a function such as std::cout, std::cin or they may want to override the functionality of a class such as std::vector or std::stack. Namespaces allow these constructs to be overridden.
The use of the programming statement:
using namespace std;
hides the fact that cin, cout, vector and stack are coming from the namespace std where cin, cout,
vector and stack are used in the code. This can cause confusion of where the code is actually
coming from.
As the software becomes more complex and uses more libraries this becomes a bigger problem.
For a more detailed discussion of why it is a bad idea to use using namespace std; see this stackoverflow question and stackoverflow question.
Reduce Complexity, Follow SRP
The Single Responsibility Principle states that every module or class should have responsibility over a single part of the functionality provided by the software, and that responsibility should be entirely encapsulated by the class. All its services should be narrowly aligned with that responsibility.
Robert C. Martin expresses the principle as follows:
A class should have only one reason to change.
While this is primarily targeted at classes in object oriented languages it applies to functions and subroutines well.
The void make_segments(const std::vector<road> &roads, const std::vector<node> &nodes, const std::string &out_s) function could be broken up into at multiple functions. The following code is a good candidate for a function to be called by make_segments:
//for every other road segment it according to heuristics
for (size_t j = 1; j<roads[i].nodes.size() - minthrs; j++) write)
//check for curvature or maxsize
write_segment(out, segid, roads[i].id, coords, minsize, maxsize, tot_n);
write = false;
The more separate functions there are the easier it is to understand or read the code. This also makes it easier for any programmer to maintain or debug the code.
Prefer Iterators Over Indexes for Container Classes
The function void write_segment(std::ofstream &out, int &id, std::string way_id, std::vector<double> &coords, int &minsize, int &maxsize, int &tot_n) contains the following code:
for (size_t i = 0; i<coords.size() - 1; i++)
out << coords[i] << ", ";
A more modern and less C programming language like construct would be:
for (auto coord_iterator: coords)
out << coord_iterator << ",";
Using iterators in this manner makes it much easier to program correctly.
Declare Variables as Close to Use as Possible
In the loop defined as a possible candidate for a function there are 3 variables that should be defined within the loop rather than at the top of the function. The variables l1, l2 and l3 are only assigned values and used within the loop.
//find the 3 sides of the triangle for every 3 points
double l1 = euclid_dist(nodes[roads[i].nodes[j - 1]].lat, nodes[roads[i].nodes[j - 1]].lon,
nodes[roads[i].nodes[j]].lat, nodes[roads[i].nodes[j]].lon);
double l2 = euclid_dist(nodes[roads[i].nodes[j]].lat, nodes[roads[i].nodes[j]].lon,
nodes[roads[i].nodes[j + 1]].lat, nodes[roads[i].nodes[j + 1]].lon);
double l3 = euclid_dist(nodes[roads[i].nodes[j + 1]].lat, nodes[roads[i].nodes[j + 1]].lon,
nodes[roads[i].nodes[j - 1]].lat, nodes[roads[i].nodes[j - 1]].lon);
//compute circumcircle radius (curvature)
curb = curvature(l1, l2, l3);
count++;
answered Jan 9 at 0:17
pacmaninbw
4,99321436
4,99321436
1
Yourfor autoloop writes an extra comma at the end, which the original code does not.
â Roland Illig
Jan 9 at 0:57
@RolandIllig true.
â pacmaninbw
Jan 9 at 1:18
I thought about declaring the l1/2/3 variables inside the for loop but isn't the constant memory allocation bad for performance? Putting std:: I think it's going to be very hard for me but I read about its advantages.
â Konstantinoscs
Jan 9 at 2:58
@Konstantinoscs If you are using an optimizing compiler at the O3 level performance shouldn't be affected by these declarations. You actually have a nice little function there called find_the_triangle if you want it, if curb = find_the_triangle().
â pacmaninbw
Jan 9 at 3:23
Can you elaborate on the optimization a bit? Are there rules of thumb about what to not care about if you put -O?
â Konstantinoscs
Jan 9 at 3:42
 |Â
show 1 more comment
1
Yourfor autoloop writes an extra comma at the end, which the original code does not.
â Roland Illig
Jan 9 at 0:57
@RolandIllig true.
â pacmaninbw
Jan 9 at 1:18
I thought about declaring the l1/2/3 variables inside the for loop but isn't the constant memory allocation bad for performance? Putting std:: I think it's going to be very hard for me but I read about its advantages.
â Konstantinoscs
Jan 9 at 2:58
@Konstantinoscs If you are using an optimizing compiler at the O3 level performance shouldn't be affected by these declarations. You actually have a nice little function there called find_the_triangle if you want it, if curb = find_the_triangle().
â pacmaninbw
Jan 9 at 3:23
Can you elaborate on the optimization a bit? Are there rules of thumb about what to not care about if you put -O?
â Konstantinoscs
Jan 9 at 3:42
1
1
Your
for auto loop writes an extra comma at the end, which the original code does not.â Roland Illig
Jan 9 at 0:57
Your
for auto loop writes an extra comma at the end, which the original code does not.â Roland Illig
Jan 9 at 0:57
@RolandIllig true.
â pacmaninbw
Jan 9 at 1:18
@RolandIllig true.
â pacmaninbw
Jan 9 at 1:18
I thought about declaring the l1/2/3 variables inside the for loop but isn't the constant memory allocation bad for performance? Putting std:: I think it's going to be very hard for me but I read about its advantages.
â Konstantinoscs
Jan 9 at 2:58
I thought about declaring the l1/2/3 variables inside the for loop but isn't the constant memory allocation bad for performance? Putting std:: I think it's going to be very hard for me but I read about its advantages.
â Konstantinoscs
Jan 9 at 2:58
@Konstantinoscs If you are using an optimizing compiler at the O3 level performance shouldn't be affected by these declarations. You actually have a nice little function there called find_the_triangle if you want it, if curb = find_the_triangle().
â pacmaninbw
Jan 9 at 3:23
@Konstantinoscs If you are using an optimizing compiler at the O3 level performance shouldn't be affected by these declarations. You actually have a nice little function there called find_the_triangle if you want it, if curb = find_the_triangle().
â pacmaninbw
Jan 9 at 3:23
Can you elaborate on the optimization a bit? Are there rules of thumb about what to not care about if you put -O?
â Konstantinoscs
Jan 9 at 3:42
Can you elaborate on the optimization a bit? Are there rules of thumb about what to not care about if you put -O?
â Konstantinoscs
Jan 9 at 3:42
 |Â
show 1 more comment
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f184588%2fsegment-road-map-data-with-heuristics%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password