BA network generator and Percolator with OpenMP and also Linear code

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
1
down vote

favorite












Here I uploaded a very basic Barabasi-Albert network generator and then a percolator which conducts percolation over the network following some rules. I have used openmp to parallelize the loops.



Here 3 arrays, maxclus,delmx and entropycalc are shared between the parallel threads and netmap1,netmap2,ptr and random are made private to the threads. What it basically does is that, suppose you have a vector, and two arrays, then,



int* arrayresult = new int [N];
int* array;
#pragma omp parallel shared(arrayresult) private(array)

vector<int> someVec;
array = new int [N]
for(int k=0;k<somenum;k++) array[k] = 0;
#pragma omp for
for(int i=0;i<somenum;i++)


// do something with someVec;
// do something with array;
for(int j=0;j<somenum1;j++)
#pragma omp atomic
arrayresult[j] += someResult;

delete array;



Now this snippet describes the main gist of the code I am posting here. This shows a performance degradation proportional to the number of cores or threads being used. I am providing both the linear code and the parallel code.



How can I make the parallel one more efficient?



Parallel Code with OpenMP



//compile with icpc filename.cpp -o executable -O3 -std=c++14 -qopenmp

#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
#include<omp.h>

//#include "openacc_curand.h"
using namespace std;

int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network

// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)

if(ptr[i]<0) return i;

return ptr[i]=findroot(ptr,ptr[i]);




int main()

int seed,vertices,m,run,filenum,M;



//I am just going to set the initial value for your need
/*
cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 1;

//this sets up the connection and initializes the array;
int con = 0;

for(int i=1;i<seed;i++)

con = con + i;


con = con + (vertices-seed)*m;

//int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
//int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber

//for(int i=1;i<=con;i++)
//
// netmap1[i] = 0;
// netmap2[i] = 0;
//

connectionNumber = con;

srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;

//saved filename

string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";

filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";

ofstream fout1,fout2,fout3;//,fout3;

//int* random = NULL;
//random = new int[connectionNumber+1];

double* maxclus = NULL;
maxclus = new double[vertices+1];

double* delmx = NULL;
delmx = new double[connectionNumber+1];

double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];

for(int i=0;i<=vertices;i++)

maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;


//for(int i=0;i<=connectionNumber;i++)
//
// random[i] = i;
//

//this is the pointer that needs to be made private for all the parallel loops
//int* ptr = new int[vertices+1];
//for(int i=0;i<vertices+1;i++) ptr[i]=0;

//the main program starts here

int* ptr; int* netmap1; int* netmap2; int* random;
int runcounter = 0;

#pragma omp parallel shared(con,runcounter,maxclus,delmx,entropycalc) private(ptr,netmap1,netmap2,random) firstprivate(connectionNumber)


ptr = new int[vertices+1];
netmap1 = new int[connectionNumber+1];
netmap2 = new int[connectionNumber+1];
random = new int[connectionNumber+1];

for(int l=0;l<=con;l++)

netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;


for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;
#pragma omp for schedule(static)
for(int i=1;i<=run;i++)

//#pragma omp critical
//cout<<"run : "<<i<<endl;
//vector<size_t> network;

vector<int> network;

/*for(int l=0;l<=con;l++)

netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;


for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;*/

connectionNumber = 0;

//cout<<network.capacity()<<endl;

//seeds are connected to the network
for(int i=1;i<=seed;i++)

for(int j=1;j<=seed;j++)

if(j>i)


connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);




int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;

for(int i = seed+1;i<=vertices; i++)

countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;

for(int j = 1; ;j++)

concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

for(int n=0;n<m;n++)

if(ab[n] == network[u]) concheck = 1;


//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)

ab[countm] = network[u];
countm=countm+1;



connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];

network.push_back(i);
network.push_back(network[u]);


if(countm==m) break;



//the random list of connection are shuffled


random_shuffle(&random[1],&random[con]);


for(int rx=1;rx<=1;rx++)

int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;

int nodeA=0,nodeB=0;

int indx1=0;

int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;


double entropy = log(vertices);



for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;


for(int i=1;i<=vertices;i++)

if(i!=connectionNumber)



int algaRandomIndex = 0;

for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)


node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;


for(int nodeindex = 0; nodeindex<2*M; nodeindex++)

if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else

int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];



int clusmul[M];
int clusindex1 = 0;

for(int clusindex = 0; clusindex<M; clusindex++)

clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;




bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)

bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;

clusmulCheck = clusmulCheck && clusmulCheck1;


if(clusmulCheck)

nodeA = node[0];
nodeB = node[1];

for(int someK = 1; someK < M; someK++)


int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);



int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;


else

int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)

if(clusmul[as]<low)

low = clusmul[as];
indx1 = as+1;



nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];

int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;

for(int ase = 1; ase < M; ase++)

int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;








if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)


en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)


en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;



else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)


en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)


int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)

if(ptr[g]<ptr[h])


ptr[g] += ptr[h];
ptr[h] = g;
index = g;

else


ptr[h] += ptr[g];
ptr[g] = h;
index = h;

entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else


jump=big-bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;

continue;



if(-ptr[index]>big) big = -ptr[index];

jump = big - bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;

if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;







network.clear();

#pragma omp atomic
runcounter++;

int rem = (runcounter * 100/run) % 5;

if(rem == 0)
cout<<"Progress: "<<(double)runcounter*100/run<<"%"<<endl;



delete ptr;
delete netmap1;
delete netmap2;
delete random;




//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());

connectionNumber = con;

for(int i=1;i<=vertices;i++)

//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;


//fout1.close();
//fout2.close();
//fout3.close();


//delete random;
//random = NULL;
//delete netmap1;
//netmap1 = NULL;
//delete netmap2;
//netmap2 = NULL;
//delete ptr;
//ptr = NULL;

delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;



return 0;




Linear Code



#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
//#include<bits/stdc++.h>
//#include "openacc_curand.h"
using namespace std;
//vector<int> network;
int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network

// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)

if(ptr[i]<0) return i;

return ptr[i]=findroot(ptr,ptr[i]);


/*#pragma acc routine seq
int findroot(int ptr,int i)

//cao = 1;
int r,s;
r = s = i;
while (ptr[r]>=0)

ptr[s] = ptr[r];
s = r;
r = ptr[r];

return r;
*/

int main()

int seed,vertices,m,run,filenum,M;



//I am just going to set the initial value for your need

/* cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 10;

//this sets up the connection and initializes the array;
int con = 0;

for(int i=1;i<seed;i++)

con = con + i;


con = con + (vertices-seed)*m;

int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber

for(int i=1;i<=con;i++)

netmap1[i] = 0;
netmap2[i] = 0;


connectionNumber = con;

srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;

//saved filename

string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";

filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";

ofstream fout1,fout2,fout3;//,fout3;

int* random = NULL;
random = new int[connectionNumber+1];

double* maxclus = NULL;
maxclus = new double[vertices+1];

double* delmx = NULL;
delmx = new double[connectionNumber+1];

double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];

for(int i=0;i<=vertices;i++)

maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;


for(int i=0;i<=connectionNumber;i++)

random[i] = i;


//this is the pointer that needs to be made private for all the parallel loops
int* ptr = new int[vertices+1];
for(int i=0;i<vertices+1;i++) ptr[i]=0;

//the main program starts here

//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])

for(int i=1;i<=run;i++)

cout<<"run : "<<i<<endl;
//vector<size_t> network;
vector<int> network;
connectionNumber = 0;

//cout<<network.capacity()<<endl;

//seeds are connected to the network
for(int i=1;i<=seed;i++)

for(int j=1;j<=seed;j++)

if(j>i)


connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);




int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;

for(int i = seed+1;i<=vertices; i++)

countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;

for(int j = 1; ;j++)

concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

for(int n=0;n<m;n++)

if(ab[n] == network[u]) concheck = 1;


//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)

ab[countm] = network[u];
countm=countm+1;



connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];

network.push_back(i);
network.push_back(network[u]);


if(countm==m) break;



//the random list of connection are shuffled
random_shuffle(&random[1],&random[connectionNumber]);

double rand_seed = time(NULL);

//this is where the problem lies
//basically i want to make all the rx loops parallel in such a way that every parallel loop will have their own copy of ptr[ ] and random[ ] which they can modify themselves
// this whole part does the 'explosive percolation' and saves the data in maxclus, delmx, entropycalc array of different runs
//#pragma acc update device(maxclus,delmx,entropycalc,netmap1,netmap2)
//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])
//#pragma acc parallel loop private(ptr[0:vertices+1]) firstprivate(random[0:connectionNumber])
for(int rx=1;rx<=1;rx++)

int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;

int nodeA=0,nodeB=0;

int indx1=0;

int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;


double entropy = log(vertices);

//curandState_t state;
//curand_init(rand_seed*rx,0,0,&state);

for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;

//#pragma acc loop seq
for(int i=1;i<=vertices;i++)

if(i!=connectionNumber)



int algaRandomIndex = 0;

for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)


node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;


for(int nodeindex = 0; nodeindex<2*M; nodeindex++)

if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else

int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];



int clusmul[M];
int clusindex1 = 0;

for(int clusindex = 0; clusindex<M; clusindex++)

clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;




bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)

bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;

clusmulCheck = clusmulCheck && clusmulCheck1;


if(clusmulCheck)

nodeA = node[0];
nodeB = node[1];

for(int someK = 1; someK < M; someK++)


int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);



int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;


else

int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)

if(clusmul[as]<low)

low = clusmul[as];
indx1 = as+1;



nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];

int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;

for(int ase = 1; ase < M; ase++)

int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;








if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)


en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)


en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;



else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)


en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)


int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)

if(ptr[g]<ptr[h])


ptr[g] += ptr[h];
ptr[h] = g;
index = g;

else


ptr[h] += ptr[g];
ptr[g] = h;
index = h;

entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else


jump=big-bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
//#pragma acc atomic
entropycalc[i] += entropy;
bigtemp = big;

continue;



if(-ptr[index]>big) big = -ptr[index];

jump = big - bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
//#pragma acc atomic
if(entropy<0) entropy = 0;
entropycalc[i] += entropy;
bigtemp = big;






//vector<size_t>().swap(network);
//vector<int>().swap(network);
//network.clear();
//network.erase(network.begin(),network.end());
//cout<<network.capacity()<<endl;
network.shrink_to_fit();
//cout<<network.capacity()<<endl;

/*for(int i=0;i<connectionNumber;i++)


cout<<"maxclus: "<<maxclus[i]<<'t'<<"delmx: "<<delmx[i]<<'t'<<"entropy: "<<entropycalc[i]<<'t'<<endl;

*/


//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());

connectionNumber = con;

for(int i=1;i<=vertices;i++)

//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;


//fout1.close();
//fout2.close();
//fout3.close();


delete random;
random = NULL;
delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;

delete netmap1;
netmap1 = NULL;
delete netmap2;
netmap2 = NULL;

delete ptr;
ptr = NULL;

return 0;








share|improve this question

















  • 3




    With a nearly 500 lines long main() function this code is hard to reason about. Maybe split it into multiple function? I've been staring at it for like 15 minutes, and still can't get a gist of what is actually being done. Doing so might even help the optimizer, as it then has more knowledge about the lifetime and scope of variables.
    – hoffmale
    Jul 14 at 14:36

















up vote
1
down vote

favorite












Here I uploaded a very basic Barabasi-Albert network generator and then a percolator which conducts percolation over the network following some rules. I have used openmp to parallelize the loops.



Here 3 arrays, maxclus,delmx and entropycalc are shared between the parallel threads and netmap1,netmap2,ptr and random are made private to the threads. What it basically does is that, suppose you have a vector, and two arrays, then,



int* arrayresult = new int [N];
int* array;
#pragma omp parallel shared(arrayresult) private(array)

vector<int> someVec;
array = new int [N]
for(int k=0;k<somenum;k++) array[k] = 0;
#pragma omp for
for(int i=0;i<somenum;i++)


// do something with someVec;
// do something with array;
for(int j=0;j<somenum1;j++)
#pragma omp atomic
arrayresult[j] += someResult;

delete array;



Now this snippet describes the main gist of the code I am posting here. This shows a performance degradation proportional to the number of cores or threads being used. I am providing both the linear code and the parallel code.



How can I make the parallel one more efficient?



Parallel Code with OpenMP



//compile with icpc filename.cpp -o executable -O3 -std=c++14 -qopenmp

#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
#include<omp.h>

//#include "openacc_curand.h"
using namespace std;

int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network

// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)

if(ptr[i]<0) return i;

return ptr[i]=findroot(ptr,ptr[i]);




int main()

int seed,vertices,m,run,filenum,M;



//I am just going to set the initial value for your need
/*
cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 1;

//this sets up the connection and initializes the array;
int con = 0;

for(int i=1;i<seed;i++)

con = con + i;


con = con + (vertices-seed)*m;

//int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
//int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber

//for(int i=1;i<=con;i++)
//
// netmap1[i] = 0;
// netmap2[i] = 0;
//

connectionNumber = con;

srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;

//saved filename

string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";

filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";

ofstream fout1,fout2,fout3;//,fout3;

//int* random = NULL;
//random = new int[connectionNumber+1];

double* maxclus = NULL;
maxclus = new double[vertices+1];

double* delmx = NULL;
delmx = new double[connectionNumber+1];

double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];

for(int i=0;i<=vertices;i++)

maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;


//for(int i=0;i<=connectionNumber;i++)
//
// random[i] = i;
//

//this is the pointer that needs to be made private for all the parallel loops
//int* ptr = new int[vertices+1];
//for(int i=0;i<vertices+1;i++) ptr[i]=0;

//the main program starts here

int* ptr; int* netmap1; int* netmap2; int* random;
int runcounter = 0;

#pragma omp parallel shared(con,runcounter,maxclus,delmx,entropycalc) private(ptr,netmap1,netmap2,random) firstprivate(connectionNumber)


ptr = new int[vertices+1];
netmap1 = new int[connectionNumber+1];
netmap2 = new int[connectionNumber+1];
random = new int[connectionNumber+1];

for(int l=0;l<=con;l++)

netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;


for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;
#pragma omp for schedule(static)
for(int i=1;i<=run;i++)

//#pragma omp critical
//cout<<"run : "<<i<<endl;
//vector<size_t> network;

vector<int> network;

/*for(int l=0;l<=con;l++)

netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;


for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;*/

connectionNumber = 0;

//cout<<network.capacity()<<endl;

//seeds are connected to the network
for(int i=1;i<=seed;i++)

for(int j=1;j<=seed;j++)

if(j>i)


connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);




int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;

for(int i = seed+1;i<=vertices; i++)

countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;

for(int j = 1; ;j++)

concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

for(int n=0;n<m;n++)

if(ab[n] == network[u]) concheck = 1;


//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)

ab[countm] = network[u];
countm=countm+1;



connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];

network.push_back(i);
network.push_back(network[u]);


if(countm==m) break;



//the random list of connection are shuffled


random_shuffle(&random[1],&random[con]);


for(int rx=1;rx<=1;rx++)

int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;

int nodeA=0,nodeB=0;

int indx1=0;

int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;


double entropy = log(vertices);



for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;


for(int i=1;i<=vertices;i++)

if(i!=connectionNumber)



int algaRandomIndex = 0;

for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)


node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;


for(int nodeindex = 0; nodeindex<2*M; nodeindex++)

if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else

int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];



int clusmul[M];
int clusindex1 = 0;

for(int clusindex = 0; clusindex<M; clusindex++)

clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;




bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)

bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;

clusmulCheck = clusmulCheck && clusmulCheck1;


if(clusmulCheck)

nodeA = node[0];
nodeB = node[1];

for(int someK = 1; someK < M; someK++)


int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);



int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;


else

int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)

if(clusmul[as]<low)

low = clusmul[as];
indx1 = as+1;



nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];

int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;

for(int ase = 1; ase < M; ase++)

int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;








if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)


en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)


en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;



else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)


en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)


int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)

if(ptr[g]<ptr[h])


ptr[g] += ptr[h];
ptr[h] = g;
index = g;

else


ptr[h] += ptr[g];
ptr[g] = h;
index = h;

entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else


jump=big-bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;

continue;



if(-ptr[index]>big) big = -ptr[index];

jump = big - bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;

if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;







network.clear();

#pragma omp atomic
runcounter++;

int rem = (runcounter * 100/run) % 5;

if(rem == 0)
cout<<"Progress: "<<(double)runcounter*100/run<<"%"<<endl;



delete ptr;
delete netmap1;
delete netmap2;
delete random;




//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());

connectionNumber = con;

for(int i=1;i<=vertices;i++)

//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;


//fout1.close();
//fout2.close();
//fout3.close();


//delete random;
//random = NULL;
//delete netmap1;
//netmap1 = NULL;
//delete netmap2;
//netmap2 = NULL;
//delete ptr;
//ptr = NULL;

delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;



return 0;




Linear Code



#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
//#include<bits/stdc++.h>
//#include "openacc_curand.h"
using namespace std;
//vector<int> network;
int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network

// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)

if(ptr[i]<0) return i;

return ptr[i]=findroot(ptr,ptr[i]);


/*#pragma acc routine seq
int findroot(int ptr,int i)

//cao = 1;
int r,s;
r = s = i;
while (ptr[r]>=0)

ptr[s] = ptr[r];
s = r;
r = ptr[r];

return r;
*/

int main()

int seed,vertices,m,run,filenum,M;



//I am just going to set the initial value for your need

/* cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 10;

//this sets up the connection and initializes the array;
int con = 0;

for(int i=1;i<seed;i++)

con = con + i;


con = con + (vertices-seed)*m;

int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber

for(int i=1;i<=con;i++)

netmap1[i] = 0;
netmap2[i] = 0;


connectionNumber = con;

srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;

//saved filename

string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";

filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";

ofstream fout1,fout2,fout3;//,fout3;

int* random = NULL;
random = new int[connectionNumber+1];

double* maxclus = NULL;
maxclus = new double[vertices+1];

double* delmx = NULL;
delmx = new double[connectionNumber+1];

double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];

for(int i=0;i<=vertices;i++)

maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;


for(int i=0;i<=connectionNumber;i++)

random[i] = i;


//this is the pointer that needs to be made private for all the parallel loops
int* ptr = new int[vertices+1];
for(int i=0;i<vertices+1;i++) ptr[i]=0;

//the main program starts here

//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])

for(int i=1;i<=run;i++)

cout<<"run : "<<i<<endl;
//vector<size_t> network;
vector<int> network;
connectionNumber = 0;

//cout<<network.capacity()<<endl;

//seeds are connected to the network
for(int i=1;i<=seed;i++)

for(int j=1;j<=seed;j++)

if(j>i)


connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);




int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;

for(int i = seed+1;i<=vertices; i++)

countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;

for(int j = 1; ;j++)

concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

for(int n=0;n<m;n++)

if(ab[n] == network[u]) concheck = 1;


//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)

ab[countm] = network[u];
countm=countm+1;



connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];

network.push_back(i);
network.push_back(network[u]);


if(countm==m) break;



//the random list of connection are shuffled
random_shuffle(&random[1],&random[connectionNumber]);

double rand_seed = time(NULL);

//this is where the problem lies
//basically i want to make all the rx loops parallel in such a way that every parallel loop will have their own copy of ptr[ ] and random[ ] which they can modify themselves
// this whole part does the 'explosive percolation' and saves the data in maxclus, delmx, entropycalc array of different runs
//#pragma acc update device(maxclus,delmx,entropycalc,netmap1,netmap2)
//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])
//#pragma acc parallel loop private(ptr[0:vertices+1]) firstprivate(random[0:connectionNumber])
for(int rx=1;rx<=1;rx++)

int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;

int nodeA=0,nodeB=0;

int indx1=0;

int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;


double entropy = log(vertices);

//curandState_t state;
//curand_init(rand_seed*rx,0,0,&state);

for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;

//#pragma acc loop seq
for(int i=1;i<=vertices;i++)

if(i!=connectionNumber)



int algaRandomIndex = 0;

for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)


node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;


for(int nodeindex = 0; nodeindex<2*M; nodeindex++)

if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else

int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];



int clusmul[M];
int clusindex1 = 0;

for(int clusindex = 0; clusindex<M; clusindex++)

clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;




bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)

bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;

clusmulCheck = clusmulCheck && clusmulCheck1;


if(clusmulCheck)

nodeA = node[0];
nodeB = node[1];

for(int someK = 1; someK < M; someK++)


int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);



int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;


else

int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)

if(clusmul[as]<low)

low = clusmul[as];
indx1 = as+1;



nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];

int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;

for(int ase = 1; ase < M; ase++)

int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;








if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)


en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)


en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;



else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)


en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)


int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)

if(ptr[g]<ptr[h])


ptr[g] += ptr[h];
ptr[h] = g;
index = g;

else


ptr[h] += ptr[g];
ptr[g] = h;
index = h;

entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else


jump=big-bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
//#pragma acc atomic
entropycalc[i] += entropy;
bigtemp = big;

continue;



if(-ptr[index]>big) big = -ptr[index];

jump = big - bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
//#pragma acc atomic
if(entropy<0) entropy = 0;
entropycalc[i] += entropy;
bigtemp = big;






//vector<size_t>().swap(network);
//vector<int>().swap(network);
//network.clear();
//network.erase(network.begin(),network.end());
//cout<<network.capacity()<<endl;
network.shrink_to_fit();
//cout<<network.capacity()<<endl;

/*for(int i=0;i<connectionNumber;i++)


cout<<"maxclus: "<<maxclus[i]<<'t'<<"delmx: "<<delmx[i]<<'t'<<"entropy: "<<entropycalc[i]<<'t'<<endl;

*/


//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());

connectionNumber = con;

for(int i=1;i<=vertices;i++)

//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;


//fout1.close();
//fout2.close();
//fout3.close();


delete random;
random = NULL;
delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;

delete netmap1;
netmap1 = NULL;
delete netmap2;
netmap2 = NULL;

delete ptr;
ptr = NULL;

return 0;








share|improve this question

















  • 3




    With a nearly 500 lines long main() function this code is hard to reason about. Maybe split it into multiple function? I've been staring at it for like 15 minutes, and still can't get a gist of what is actually being done. Doing so might even help the optimizer, as it then has more knowledge about the lifetime and scope of variables.
    – hoffmale
    Jul 14 at 14:36













up vote
1
down vote

favorite









up vote
1
down vote

favorite











Here I uploaded a very basic Barabasi-Albert network generator and then a percolator which conducts percolation over the network following some rules. I have used openmp to parallelize the loops.



Here 3 arrays, maxclus,delmx and entropycalc are shared between the parallel threads and netmap1,netmap2,ptr and random are made private to the threads. What it basically does is that, suppose you have a vector, and two arrays, then,



int* arrayresult = new int [N];
int* array;
#pragma omp parallel shared(arrayresult) private(array)

vector<int> someVec;
array = new int [N]
for(int k=0;k<somenum;k++) array[k] = 0;
#pragma omp for
for(int i=0;i<somenum;i++)


// do something with someVec;
// do something with array;
for(int j=0;j<somenum1;j++)
#pragma omp atomic
arrayresult[j] += someResult;

delete array;



Now this snippet describes the main gist of the code I am posting here. This shows a performance degradation proportional to the number of cores or threads being used. I am providing both the linear code and the parallel code.



How can I make the parallel one more efficient?



Parallel Code with OpenMP



//compile with icpc filename.cpp -o executable -O3 -std=c++14 -qopenmp

#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
#include<omp.h>

//#include "openacc_curand.h"
using namespace std;

int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network

// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)

if(ptr[i]<0) return i;

return ptr[i]=findroot(ptr,ptr[i]);




int main()

int seed,vertices,m,run,filenum,M;



//I am just going to set the initial value for your need
/*
cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 1;

//this sets up the connection and initializes the array;
int con = 0;

for(int i=1;i<seed;i++)

con = con + i;


con = con + (vertices-seed)*m;

//int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
//int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber

//for(int i=1;i<=con;i++)
//
// netmap1[i] = 0;
// netmap2[i] = 0;
//

connectionNumber = con;

srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;

//saved filename

string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";

filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";

ofstream fout1,fout2,fout3;//,fout3;

//int* random = NULL;
//random = new int[connectionNumber+1];

double* maxclus = NULL;
maxclus = new double[vertices+1];

double* delmx = NULL;
delmx = new double[connectionNumber+1];

double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];

for(int i=0;i<=vertices;i++)

maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;


//for(int i=0;i<=connectionNumber;i++)
//
// random[i] = i;
//

//this is the pointer that needs to be made private for all the parallel loops
//int* ptr = new int[vertices+1];
//for(int i=0;i<vertices+1;i++) ptr[i]=0;

//the main program starts here

int* ptr; int* netmap1; int* netmap2; int* random;
int runcounter = 0;

#pragma omp parallel shared(con,runcounter,maxclus,delmx,entropycalc) private(ptr,netmap1,netmap2,random) firstprivate(connectionNumber)


ptr = new int[vertices+1];
netmap1 = new int[connectionNumber+1];
netmap2 = new int[connectionNumber+1];
random = new int[connectionNumber+1];

for(int l=0;l<=con;l++)

netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;


for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;
#pragma omp for schedule(static)
for(int i=1;i<=run;i++)

//#pragma omp critical
//cout<<"run : "<<i<<endl;
//vector<size_t> network;

vector<int> network;

/*for(int l=0;l<=con;l++)

netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;


for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;*/

connectionNumber = 0;

//cout<<network.capacity()<<endl;

//seeds are connected to the network
for(int i=1;i<=seed;i++)

for(int j=1;j<=seed;j++)

if(j>i)


connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);




int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;

for(int i = seed+1;i<=vertices; i++)

countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;

for(int j = 1; ;j++)

concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

for(int n=0;n<m;n++)

if(ab[n] == network[u]) concheck = 1;


//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)

ab[countm] = network[u];
countm=countm+1;



connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];

network.push_back(i);
network.push_back(network[u]);


if(countm==m) break;



//the random list of connection are shuffled


random_shuffle(&random[1],&random[con]);


for(int rx=1;rx<=1;rx++)

int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;

int nodeA=0,nodeB=0;

int indx1=0;

int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;


double entropy = log(vertices);



for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;


for(int i=1;i<=vertices;i++)

if(i!=connectionNumber)



int algaRandomIndex = 0;

for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)


node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;


for(int nodeindex = 0; nodeindex<2*M; nodeindex++)

if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else

int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];



int clusmul[M];
int clusindex1 = 0;

for(int clusindex = 0; clusindex<M; clusindex++)

clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;




bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)

bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;

clusmulCheck = clusmulCheck && clusmulCheck1;


if(clusmulCheck)

nodeA = node[0];
nodeB = node[1];

for(int someK = 1; someK < M; someK++)


int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);



int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;


else

int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)

if(clusmul[as]<low)

low = clusmul[as];
indx1 = as+1;



nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];

int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;

for(int ase = 1; ase < M; ase++)

int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;








if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)


en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)


en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;



else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)


en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)


int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)

if(ptr[g]<ptr[h])


ptr[g] += ptr[h];
ptr[h] = g;
index = g;

else


ptr[h] += ptr[g];
ptr[g] = h;
index = h;

entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else


jump=big-bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;

continue;



if(-ptr[index]>big) big = -ptr[index];

jump = big - bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;

if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;







network.clear();

#pragma omp atomic
runcounter++;

int rem = (runcounter * 100/run) % 5;

if(rem == 0)
cout<<"Progress: "<<(double)runcounter*100/run<<"%"<<endl;



delete ptr;
delete netmap1;
delete netmap2;
delete random;




//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());

connectionNumber = con;

for(int i=1;i<=vertices;i++)

//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;


//fout1.close();
//fout2.close();
//fout3.close();


//delete random;
//random = NULL;
//delete netmap1;
//netmap1 = NULL;
//delete netmap2;
//netmap2 = NULL;
//delete ptr;
//ptr = NULL;

delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;



return 0;




Linear Code



#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
//#include<bits/stdc++.h>
//#include "openacc_curand.h"
using namespace std;
//vector<int> network;
int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network

// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)

if(ptr[i]<0) return i;

return ptr[i]=findroot(ptr,ptr[i]);


/*#pragma acc routine seq
int findroot(int ptr,int i)

//cao = 1;
int r,s;
r = s = i;
while (ptr[r]>=0)

ptr[s] = ptr[r];
s = r;
r = ptr[r];

return r;
*/

int main()

int seed,vertices,m,run,filenum,M;



//I am just going to set the initial value for your need

/* cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 10;

//this sets up the connection and initializes the array;
int con = 0;

for(int i=1;i<seed;i++)

con = con + i;


con = con + (vertices-seed)*m;

int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber

for(int i=1;i<=con;i++)

netmap1[i] = 0;
netmap2[i] = 0;


connectionNumber = con;

srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;

//saved filename

string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";

filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";

ofstream fout1,fout2,fout3;//,fout3;

int* random = NULL;
random = new int[connectionNumber+1];

double* maxclus = NULL;
maxclus = new double[vertices+1];

double* delmx = NULL;
delmx = new double[connectionNumber+1];

double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];

for(int i=0;i<=vertices;i++)

maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;


for(int i=0;i<=connectionNumber;i++)

random[i] = i;


//this is the pointer that needs to be made private for all the parallel loops
int* ptr = new int[vertices+1];
for(int i=0;i<vertices+1;i++) ptr[i]=0;

//the main program starts here

//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])

for(int i=1;i<=run;i++)

cout<<"run : "<<i<<endl;
//vector<size_t> network;
vector<int> network;
connectionNumber = 0;

//cout<<network.capacity()<<endl;

//seeds are connected to the network
for(int i=1;i<=seed;i++)

for(int j=1;j<=seed;j++)

if(j>i)


connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);




int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;

for(int i = seed+1;i<=vertices; i++)

countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;

for(int j = 1; ;j++)

concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

for(int n=0;n<m;n++)

if(ab[n] == network[u]) concheck = 1;


//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)

ab[countm] = network[u];
countm=countm+1;



connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];

network.push_back(i);
network.push_back(network[u]);


if(countm==m) break;



//the random list of connection are shuffled
random_shuffle(&random[1],&random[connectionNumber]);

double rand_seed = time(NULL);

//this is where the problem lies
//basically i want to make all the rx loops parallel in such a way that every parallel loop will have their own copy of ptr[ ] and random[ ] which they can modify themselves
// this whole part does the 'explosive percolation' and saves the data in maxclus, delmx, entropycalc array of different runs
//#pragma acc update device(maxclus,delmx,entropycalc,netmap1,netmap2)
//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])
//#pragma acc parallel loop private(ptr[0:vertices+1]) firstprivate(random[0:connectionNumber])
for(int rx=1;rx<=1;rx++)

int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;

int nodeA=0,nodeB=0;

int indx1=0;

int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;


double entropy = log(vertices);

//curandState_t state;
//curand_init(rand_seed*rx,0,0,&state);

for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;

//#pragma acc loop seq
for(int i=1;i<=vertices;i++)

if(i!=connectionNumber)



int algaRandomIndex = 0;

for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)


node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;


for(int nodeindex = 0; nodeindex<2*M; nodeindex++)

if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else

int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];



int clusmul[M];
int clusindex1 = 0;

for(int clusindex = 0; clusindex<M; clusindex++)

clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;




bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)

bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;

clusmulCheck = clusmulCheck && clusmulCheck1;


if(clusmulCheck)

nodeA = node[0];
nodeB = node[1];

for(int someK = 1; someK < M; someK++)


int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);



int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;


else

int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)

if(clusmul[as]<low)

low = clusmul[as];
indx1 = as+1;



nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];

int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;

for(int ase = 1; ase < M; ase++)

int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;








if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)


en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)


en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;



else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)


en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)


int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)

if(ptr[g]<ptr[h])


ptr[g] += ptr[h];
ptr[h] = g;
index = g;

else


ptr[h] += ptr[g];
ptr[g] = h;
index = h;

entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else


jump=big-bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
//#pragma acc atomic
entropycalc[i] += entropy;
bigtemp = big;

continue;



if(-ptr[index]>big) big = -ptr[index];

jump = big - bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
//#pragma acc atomic
if(entropy<0) entropy = 0;
entropycalc[i] += entropy;
bigtemp = big;






//vector<size_t>().swap(network);
//vector<int>().swap(network);
//network.clear();
//network.erase(network.begin(),network.end());
//cout<<network.capacity()<<endl;
network.shrink_to_fit();
//cout<<network.capacity()<<endl;

/*for(int i=0;i<connectionNumber;i++)


cout<<"maxclus: "<<maxclus[i]<<'t'<<"delmx: "<<delmx[i]<<'t'<<"entropy: "<<entropycalc[i]<<'t'<<endl;

*/


//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());

connectionNumber = con;

for(int i=1;i<=vertices;i++)

//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;


//fout1.close();
//fout2.close();
//fout3.close();


delete random;
random = NULL;
delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;

delete netmap1;
netmap1 = NULL;
delete netmap2;
netmap2 = NULL;

delete ptr;
ptr = NULL;

return 0;








share|improve this question













Here I uploaded a very basic Barabasi-Albert network generator and then a percolator which conducts percolation over the network following some rules. I have used openmp to parallelize the loops.



Here 3 arrays, maxclus,delmx and entropycalc are shared between the parallel threads and netmap1,netmap2,ptr and random are made private to the threads. What it basically does is that, suppose you have a vector, and two arrays, then,



int* arrayresult = new int [N];
int* array;
#pragma omp parallel shared(arrayresult) private(array)

vector<int> someVec;
array = new int [N]
for(int k=0;k<somenum;k++) array[k] = 0;
#pragma omp for
for(int i=0;i<somenum;i++)


// do something with someVec;
// do something with array;
for(int j=0;j<somenum1;j++)
#pragma omp atomic
arrayresult[j] += someResult;

delete array;



Now this snippet describes the main gist of the code I am posting here. This shows a performance degradation proportional to the number of cores or threads being used. I am providing both the linear code and the parallel code.



How can I make the parallel one more efficient?



Parallel Code with OpenMP



//compile with icpc filename.cpp -o executable -O3 -std=c++14 -qopenmp

#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
#include<omp.h>

//#include "openacc_curand.h"
using namespace std;

int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network

// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)

if(ptr[i]<0) return i;

return ptr[i]=findroot(ptr,ptr[i]);




int main()

int seed,vertices,m,run,filenum,M;



//I am just going to set the initial value for your need
/*
cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 1;

//this sets up the connection and initializes the array;
int con = 0;

for(int i=1;i<seed;i++)

con = con + i;


con = con + (vertices-seed)*m;

//int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
//int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber

//for(int i=1;i<=con;i++)
//
// netmap1[i] = 0;
// netmap2[i] = 0;
//

connectionNumber = con;

srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;

//saved filename

string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";

filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";

ofstream fout1,fout2,fout3;//,fout3;

//int* random = NULL;
//random = new int[connectionNumber+1];

double* maxclus = NULL;
maxclus = new double[vertices+1];

double* delmx = NULL;
delmx = new double[connectionNumber+1];

double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];

for(int i=0;i<=vertices;i++)

maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;


//for(int i=0;i<=connectionNumber;i++)
//
// random[i] = i;
//

//this is the pointer that needs to be made private for all the parallel loops
//int* ptr = new int[vertices+1];
//for(int i=0;i<vertices+1;i++) ptr[i]=0;

//the main program starts here

int* ptr; int* netmap1; int* netmap2; int* random;
int runcounter = 0;

#pragma omp parallel shared(con,runcounter,maxclus,delmx,entropycalc) private(ptr,netmap1,netmap2,random) firstprivate(connectionNumber)


ptr = new int[vertices+1];
netmap1 = new int[connectionNumber+1];
netmap2 = new int[connectionNumber+1];
random = new int[connectionNumber+1];

for(int l=0;l<=con;l++)

netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;


for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;
#pragma omp for schedule(static)
for(int i=1;i<=run;i++)

//#pragma omp critical
//cout<<"run : "<<i<<endl;
//vector<size_t> network;

vector<int> network;

/*for(int l=0;l<=con;l++)

netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;


for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;*/

connectionNumber = 0;

//cout<<network.capacity()<<endl;

//seeds are connected to the network
for(int i=1;i<=seed;i++)

for(int j=1;j<=seed;j++)

if(j>i)


connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);




int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;

for(int i = seed+1;i<=vertices; i++)

countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;

for(int j = 1; ;j++)

concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

for(int n=0;n<m;n++)

if(ab[n] == network[u]) concheck = 1;


//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)

ab[countm] = network[u];
countm=countm+1;



connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];

network.push_back(i);
network.push_back(network[u]);


if(countm==m) break;



//the random list of connection are shuffled


random_shuffle(&random[1],&random[con]);


for(int rx=1;rx<=1;rx++)

int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;

int nodeA=0,nodeB=0;

int indx1=0;

int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;


double entropy = log(vertices);



for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;


for(int i=1;i<=vertices;i++)

if(i!=connectionNumber)



int algaRandomIndex = 0;

for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)


node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;


for(int nodeindex = 0; nodeindex<2*M; nodeindex++)

if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else

int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];



int clusmul[M];
int clusindex1 = 0;

for(int clusindex = 0; clusindex<M; clusindex++)

clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;




bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)

bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;

clusmulCheck = clusmulCheck && clusmulCheck1;


if(clusmulCheck)

nodeA = node[0];
nodeB = node[1];

for(int someK = 1; someK < M; someK++)


int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);



int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;


else

int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)

if(clusmul[as]<low)

low = clusmul[as];
indx1 = as+1;



nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];

int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;

for(int ase = 1; ase < M; ase++)

int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;








if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)


en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)


en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;



else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)


en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)


int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)

if(ptr[g]<ptr[h])


ptr[g] += ptr[h];
ptr[h] = g;
index = g;

else


ptr[h] += ptr[g];
ptr[g] = h;
index = h;

entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else


jump=big-bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;

continue;



if(-ptr[index]>big) big = -ptr[index];

jump = big - bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;

if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;







network.clear();

#pragma omp atomic
runcounter++;

int rem = (runcounter * 100/run) % 5;

if(rem == 0)
cout<<"Progress: "<<(double)runcounter*100/run<<"%"<<endl;



delete ptr;
delete netmap1;
delete netmap2;
delete random;




//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());

connectionNumber = con;

for(int i=1;i<=vertices;i++)

//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;


//fout1.close();
//fout2.close();
//fout3.close();


//delete random;
//random = NULL;
//delete netmap1;
//netmap1 = NULL;
//delete netmap2;
//netmap2 = NULL;
//delete ptr;
//ptr = NULL;

delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;



return 0;




Linear Code



#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
//#include<bits/stdc++.h>
//#include "openacc_curand.h"
using namespace std;
//vector<int> network;
int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network

// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)

if(ptr[i]<0) return i;

return ptr[i]=findroot(ptr,ptr[i]);


/*#pragma acc routine seq
int findroot(int ptr,int i)

//cao = 1;
int r,s;
r = s = i;
while (ptr[r]>=0)

ptr[s] = ptr[r];
s = r;
r = ptr[r];

return r;
*/

int main()

int seed,vertices,m,run,filenum,M;



//I am just going to set the initial value for your need

/* cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 10;

//this sets up the connection and initializes the array;
int con = 0;

for(int i=1;i<seed;i++)

con = con + i;


con = con + (vertices-seed)*m;

int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber

for(int i=1;i<=con;i++)

netmap1[i] = 0;
netmap2[i] = 0;


connectionNumber = con;

srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;

//saved filename

string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";

filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";

ofstream fout1,fout2,fout3;//,fout3;

int* random = NULL;
random = new int[connectionNumber+1];

double* maxclus = NULL;
maxclus = new double[vertices+1];

double* delmx = NULL;
delmx = new double[connectionNumber+1];

double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];

for(int i=0;i<=vertices;i++)

maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;


for(int i=0;i<=connectionNumber;i++)

random[i] = i;


//this is the pointer that needs to be made private for all the parallel loops
int* ptr = new int[vertices+1];
for(int i=0;i<vertices+1;i++) ptr[i]=0;

//the main program starts here

//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])

for(int i=1;i<=run;i++)

cout<<"run : "<<i<<endl;
//vector<size_t> network;
vector<int> network;
connectionNumber = 0;

//cout<<network.capacity()<<endl;

//seeds are connected to the network
for(int i=1;i<=seed;i++)

for(int j=1;j<=seed;j++)

if(j>i)


connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);




int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;

for(int i = seed+1;i<=vertices; i++)

countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;

for(int j = 1; ;j++)

concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

for(int n=0;n<m;n++)

if(ab[n] == network[u]) concheck = 1;


//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)

ab[countm] = network[u];
countm=countm+1;



connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];

network.push_back(i);
network.push_back(network[u]);


if(countm==m) break;



//the random list of connection are shuffled
random_shuffle(&random[1],&random[connectionNumber]);

double rand_seed = time(NULL);

//this is where the problem lies
//basically i want to make all the rx loops parallel in such a way that every parallel loop will have their own copy of ptr[ ] and random[ ] which they can modify themselves
// this whole part does the 'explosive percolation' and saves the data in maxclus, delmx, entropycalc array of different runs
//#pragma acc update device(maxclus,delmx,entropycalc,netmap1,netmap2)
//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])
//#pragma acc parallel loop private(ptr[0:vertices+1]) firstprivate(random[0:connectionNumber])
for(int rx=1;rx<=1;rx++)

int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;

int nodeA=0,nodeB=0;

int indx1=0;

int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;


double entropy = log(vertices);

//curandState_t state;
//curand_init(rand_seed*rx,0,0,&state);

for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;

//#pragma acc loop seq
for(int i=1;i<=vertices;i++)

if(i!=connectionNumber)



int algaRandomIndex = 0;

for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)


node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;


for(int nodeindex = 0; nodeindex<2*M; nodeindex++)

if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else

int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];



int clusmul[M];
int clusindex1 = 0;

for(int clusindex = 0; clusindex<M; clusindex++)

clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;




bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)

bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;

clusmulCheck = clusmulCheck && clusmulCheck1;


if(clusmulCheck)

nodeA = node[0];
nodeB = node[1];

for(int someK = 1; someK < M; someK++)


int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);



int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;


else

int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)

if(clusmul[as]<low)

low = clusmul[as];
indx1 = as+1;



nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];

int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;

for(int ase = 1; ase < M; ase++)

int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);

int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;








if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)


en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)


en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;



else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)


en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)


int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)

if(ptr[g]<ptr[h])


ptr[g] += ptr[h];
ptr[h] = g;
index = g;

else


ptr[h] += ptr[g];
ptr[g] = h;
index = h;

entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;

else


jump=big-bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
//#pragma acc atomic
entropycalc[i] += entropy;
bigtemp = big;

continue;



if(-ptr[index]>big) big = -ptr[index];

jump = big - bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
//#pragma acc atomic
if(entropy<0) entropy = 0;
entropycalc[i] += entropy;
bigtemp = big;






//vector<size_t>().swap(network);
//vector<int>().swap(network);
//network.clear();
//network.erase(network.begin(),network.end());
//cout<<network.capacity()<<endl;
network.shrink_to_fit();
//cout<<network.capacity()<<endl;

/*for(int i=0;i<connectionNumber;i++)


cout<<"maxclus: "<<maxclus[i]<<'t'<<"delmx: "<<delmx[i]<<'t'<<"entropy: "<<entropycalc[i]<<'t'<<endl;

*/


//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());

connectionNumber = con;

for(int i=1;i<=vertices;i++)

//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;


//fout1.close();
//fout2.close();
//fout3.close();


delete random;
random = NULL;
delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;

delete netmap1;
netmap1 = NULL;
delete netmap2;
netmap2 = NULL;

delete ptr;
ptr = NULL;

return 0;










share|improve this question












share|improve this question




share|improve this question








edited Jul 14 at 13:59









Stephen Rauch

3,49951430




3,49951430









asked Jul 14 at 7:57









Digonto Islam

354




354







  • 3




    With a nearly 500 lines long main() function this code is hard to reason about. Maybe split it into multiple function? I've been staring at it for like 15 minutes, and still can't get a gist of what is actually being done. Doing so might even help the optimizer, as it then has more knowledge about the lifetime and scope of variables.
    – hoffmale
    Jul 14 at 14:36













  • 3




    With a nearly 500 lines long main() function this code is hard to reason about. Maybe split it into multiple function? I've been staring at it for like 15 minutes, and still can't get a gist of what is actually being done. Doing so might even help the optimizer, as it then has more knowledge about the lifetime and scope of variables.
    – hoffmale
    Jul 14 at 14:36








3




3




With a nearly 500 lines long main() function this code is hard to reason about. Maybe split it into multiple function? I've been staring at it for like 15 minutes, and still can't get a gist of what is actually being done. Doing so might even help the optimizer, as it then has more knowledge about the lifetime and scope of variables.
– hoffmale
Jul 14 at 14:36





With a nearly 500 lines long main() function this code is hard to reason about. Maybe split it into multiple function? I've been staring at it for like 15 minutes, and still can't get a gist of what is actually being done. Doing so might even help the optimizer, as it then has more knowledge about the lifetime and scope of variables.
– hoffmale
Jul 14 at 14:36











1 Answer
1






active

oldest

votes

















up vote
3
down vote



accepted










I haven't examined the code in great detail yet, but the first point that nearly jumped out was the use of rand() inside the parallelized loop.



Each call to rand() not only retrieves data, but also modifies a seed that's normally shared between all threads, so access to that seed is serialized. In other words, the calls to rand won't run in parallel. My immediate advice would be to switch to the new random number generation classes that were added in C++11. They're kind of clumsy to use initially, but since each generator is an object, it's trivial to have a private instance for each thread, so they can all run in parallel.






share|improve this answer





















  • That really helped to get a performance boost. But is there any other ways to make it more efficient, like I am using maxclus,delmx and entropycalc as the shared arrays and ptr , random, netmap1, netmap2 and a vector network as private, is that making any performance degradation?
    – Digonto Islam
    Jul 17 at 9:54










Your Answer




StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);








 

draft saved


draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f198479%2fba-network-generator-and-percolator-with-openmp-and-also-linear-code%23new-answer', 'question_page');

);

Post as a guest






























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
3
down vote



accepted










I haven't examined the code in great detail yet, but the first point that nearly jumped out was the use of rand() inside the parallelized loop.



Each call to rand() not only retrieves data, but also modifies a seed that's normally shared between all threads, so access to that seed is serialized. In other words, the calls to rand won't run in parallel. My immediate advice would be to switch to the new random number generation classes that were added in C++11. They're kind of clumsy to use initially, but since each generator is an object, it's trivial to have a private instance for each thread, so they can all run in parallel.






share|improve this answer





















  • That really helped to get a performance boost. But is there any other ways to make it more efficient, like I am using maxclus,delmx and entropycalc as the shared arrays and ptr , random, netmap1, netmap2 and a vector network as private, is that making any performance degradation?
    – Digonto Islam
    Jul 17 at 9:54














up vote
3
down vote



accepted










I haven't examined the code in great detail yet, but the first point that nearly jumped out was the use of rand() inside the parallelized loop.



Each call to rand() not only retrieves data, but also modifies a seed that's normally shared between all threads, so access to that seed is serialized. In other words, the calls to rand won't run in parallel. My immediate advice would be to switch to the new random number generation classes that were added in C++11. They're kind of clumsy to use initially, but since each generator is an object, it's trivial to have a private instance for each thread, so they can all run in parallel.






share|improve this answer





















  • That really helped to get a performance boost. But is there any other ways to make it more efficient, like I am using maxclus,delmx and entropycalc as the shared arrays and ptr , random, netmap1, netmap2 and a vector network as private, is that making any performance degradation?
    – Digonto Islam
    Jul 17 at 9:54












up vote
3
down vote



accepted







up vote
3
down vote



accepted






I haven't examined the code in great detail yet, but the first point that nearly jumped out was the use of rand() inside the parallelized loop.



Each call to rand() not only retrieves data, but also modifies a seed that's normally shared between all threads, so access to that seed is serialized. In other words, the calls to rand won't run in parallel. My immediate advice would be to switch to the new random number generation classes that were added in C++11. They're kind of clumsy to use initially, but since each generator is an object, it's trivial to have a private instance for each thread, so they can all run in parallel.






share|improve this answer













I haven't examined the code in great detail yet, but the first point that nearly jumped out was the use of rand() inside the parallelized loop.



Each call to rand() not only retrieves data, but also modifies a seed that's normally shared between all threads, so access to that seed is serialized. In other words, the calls to rand won't run in parallel. My immediate advice would be to switch to the new random number generation classes that were added in C++11. They're kind of clumsy to use initially, but since each generator is an object, it's trivial to have a private instance for each thread, so they can all run in parallel.







share|improve this answer













share|improve this answer



share|improve this answer











answered Jul 15 at 5:14









Jerry Coffin

27.3k360123




27.3k360123











  • That really helped to get a performance boost. But is there any other ways to make it more efficient, like I am using maxclus,delmx and entropycalc as the shared arrays and ptr , random, netmap1, netmap2 and a vector network as private, is that making any performance degradation?
    – Digonto Islam
    Jul 17 at 9:54
















  • That really helped to get a performance boost. But is there any other ways to make it more efficient, like I am using maxclus,delmx and entropycalc as the shared arrays and ptr , random, netmap1, netmap2 and a vector network as private, is that making any performance degradation?
    – Digonto Islam
    Jul 17 at 9:54















That really helped to get a performance boost. But is there any other ways to make it more efficient, like I am using maxclus,delmx and entropycalc as the shared arrays and ptr , random, netmap1, netmap2 and a vector network as private, is that making any performance degradation?
– Digonto Islam
Jul 17 at 9:54




That really helped to get a performance boost. But is there any other ways to make it more efficient, like I am using maxclus,delmx and entropycalc as the shared arrays and ptr , random, netmap1, netmap2 and a vector network as private, is that making any performance degradation?
– Digonto Islam
Jul 17 at 9:54












 

draft saved


draft discarded


























 


draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f198479%2fba-network-generator-and-percolator-with-openmp-and-also-linear-code%23new-answer', 'question_page');

);

Post as a guest













































































Popular posts from this blog

Chat program with C++ and SFML

Function to Return a JSON Like Objects Using VBA Collections and Arrays

Will my employers contract hold up in court?