BA network generator and Percolator with OpenMP and also Linear code
Clash Royale CLAN TAG#URR8PPP
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;
up vote
1
down vote
favorite
Here I uploaded a very basic Barabasi-Albert network generator and then a percolator which conducts percolation over the network following some rules. I have used openmp to parallelize the loops.
Here 3 arrays, maxclus
,delmx
and entropycalc
are shared between the parallel threads and netmap1
,netmap2
,ptr
and random
are made private to the threads. What it basically does is that, suppose you have a vector, and two arrays, then,
int* arrayresult = new int [N];
int* array;
#pragma omp parallel shared(arrayresult) private(array)
vector<int> someVec;
array = new int [N]
for(int k=0;k<somenum;k++) array[k] = 0;
#pragma omp for
for(int i=0;i<somenum;i++)
// do something with someVec;
// do something with array;
for(int j=0;j<somenum1;j++)
#pragma omp atomic
arrayresult[j] += someResult;
delete array;
Now this snippet describes the main gist of the code I am posting here. This shows a performance degradation proportional to the number of cores or threads being used. I am providing both the linear code and the parallel code.
How can I make the parallel one more efficient?
Parallel Code with OpenMP
//compile with icpc filename.cpp -o executable -O3 -std=c++14 -qopenmp
#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
#include<omp.h>
//#include "openacc_curand.h"
using namespace std;
int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network
// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)
if(ptr[i]<0) return i;
return ptr[i]=findroot(ptr,ptr[i]);
int main()
int seed,vertices,m,run,filenum,M;
//I am just going to set the initial value for your need
/*
cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 1;
//this sets up the connection and initializes the array;
int con = 0;
for(int i=1;i<seed;i++)
con = con + i;
con = con + (vertices-seed)*m;
//int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
//int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber
//for(int i=1;i<=con;i++)
//
// netmap1[i] = 0;
// netmap2[i] = 0;
//
connectionNumber = con;
srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;
//saved filename
string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";
filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
ofstream fout1,fout2,fout3;//,fout3;
//int* random = NULL;
//random = new int[connectionNumber+1];
double* maxclus = NULL;
maxclus = new double[vertices+1];
double* delmx = NULL;
delmx = new double[connectionNumber+1];
double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];
for(int i=0;i<=vertices;i++)
maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;
//for(int i=0;i<=connectionNumber;i++)
//
// random[i] = i;
//
//this is the pointer that needs to be made private for all the parallel loops
//int* ptr = new int[vertices+1];
//for(int i=0;i<vertices+1;i++) ptr[i]=0;
//the main program starts here
int* ptr; int* netmap1; int* netmap2; int* random;
int runcounter = 0;
#pragma omp parallel shared(con,runcounter,maxclus,delmx,entropycalc) private(ptr,netmap1,netmap2,random) firstprivate(connectionNumber)
ptr = new int[vertices+1];
netmap1 = new int[connectionNumber+1];
netmap2 = new int[connectionNumber+1];
random = new int[connectionNumber+1];
for(int l=0;l<=con;l++)
netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;
for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;
#pragma omp for schedule(static)
for(int i=1;i<=run;i++)
//#pragma omp critical
//cout<<"run : "<<i<<endl;
//vector<size_t> network;
vector<int> network;
/*for(int l=0;l<=con;l++)
netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;
for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;*/
connectionNumber = 0;
//cout<<network.capacity()<<endl;
//seeds are connected to the network
for(int i=1;i<=seed;i++)
for(int j=1;j<=seed;j++)
if(j>i)
connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);
int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;
for(int i = seed+1;i<=vertices; i++)
countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;
for(int j = 1; ;j++)
concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
for(int n=0;n<m;n++)
if(ab[n] == network[u]) concheck = 1;
//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)
ab[countm] = network[u];
countm=countm+1;
connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];
network.push_back(i);
network.push_back(network[u]);
if(countm==m) break;
//the random list of connection are shuffled
random_shuffle(&random[1],&random[con]);
for(int rx=1;rx<=1;rx++)
int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;
int nodeA=0,nodeB=0;
int indx1=0;
int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;
double entropy = log(vertices);
for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;
for(int i=1;i<=vertices;i++)
if(i!=connectionNumber)
int algaRandomIndex = 0;
for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)
node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;
for(int nodeindex = 0; nodeindex<2*M; nodeindex++)
if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else
int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];
int clusmul[M];
int clusindex1 = 0;
for(int clusindex = 0; clusindex<M; clusindex++)
clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;
bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)
bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;
clusmulCheck = clusmulCheck && clusmulCheck1;
if(clusmulCheck)
nodeA = node[0];
nodeB = node[1];
for(int someK = 1; someK < M; someK++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;
else
int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)
if(clusmul[as]<low)
low = clusmul[as];
indx1 = as+1;
nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];
int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;
for(int ase = 1; ase < M; ase++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;
if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)
en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)
en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)
en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)
int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)
if(ptr[g]<ptr[h])
ptr[g] += ptr[h];
ptr[h] = g;
index = g;
else
ptr[h] += ptr[g];
ptr[g] = h;
index = h;
entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else
jump=big-bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;
continue;
if(-ptr[index]>big) big = -ptr[index];
jump = big - bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;
network.clear();
#pragma omp atomic
runcounter++;
int rem = (runcounter * 100/run) % 5;
if(rem == 0)
cout<<"Progress: "<<(double)runcounter*100/run<<"%"<<endl;
delete ptr;
delete netmap1;
delete netmap2;
delete random;
//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());
connectionNumber = con;
for(int i=1;i<=vertices;i++)
//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;
//fout1.close();
//fout2.close();
//fout3.close();
//delete random;
//random = NULL;
//delete netmap1;
//netmap1 = NULL;
//delete netmap2;
//netmap2 = NULL;
//delete ptr;
//ptr = NULL;
delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;
return 0;
Linear Code
#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
//#include<bits/stdc++.h>
//#include "openacc_curand.h"
using namespace std;
//vector<int> network;
int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network
// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)
if(ptr[i]<0) return i;
return ptr[i]=findroot(ptr,ptr[i]);
/*#pragma acc routine seq
int findroot(int ptr,int i)
//cao = 1;
int r,s;
r = s = i;
while (ptr[r]>=0)
ptr[s] = ptr[r];
s = r;
r = ptr[r];
return r;
*/
int main()
int seed,vertices,m,run,filenum,M;
//I am just going to set the initial value for your need
/* cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 10;
//this sets up the connection and initializes the array;
int con = 0;
for(int i=1;i<seed;i++)
con = con + i;
con = con + (vertices-seed)*m;
int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber
for(int i=1;i<=con;i++)
netmap1[i] = 0;
netmap2[i] = 0;
connectionNumber = con;
srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;
//saved filename
string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";
filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
ofstream fout1,fout2,fout3;//,fout3;
int* random = NULL;
random = new int[connectionNumber+1];
double* maxclus = NULL;
maxclus = new double[vertices+1];
double* delmx = NULL;
delmx = new double[connectionNumber+1];
double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];
for(int i=0;i<=vertices;i++)
maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;
for(int i=0;i<=connectionNumber;i++)
random[i] = i;
//this is the pointer that needs to be made private for all the parallel loops
int* ptr = new int[vertices+1];
for(int i=0;i<vertices+1;i++) ptr[i]=0;
//the main program starts here
//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])
for(int i=1;i<=run;i++)
cout<<"run : "<<i<<endl;
//vector<size_t> network;
vector<int> network;
connectionNumber = 0;
//cout<<network.capacity()<<endl;
//seeds are connected to the network
for(int i=1;i<=seed;i++)
for(int j=1;j<=seed;j++)
if(j>i)
connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);
int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;
for(int i = seed+1;i<=vertices; i++)
countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;
for(int j = 1; ;j++)
concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
for(int n=0;n<m;n++)
if(ab[n] == network[u]) concheck = 1;
//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)
ab[countm] = network[u];
countm=countm+1;
connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];
network.push_back(i);
network.push_back(network[u]);
if(countm==m) break;
//the random list of connection are shuffled
random_shuffle(&random[1],&random[connectionNumber]);
double rand_seed = time(NULL);
//this is where the problem lies
//basically i want to make all the rx loops parallel in such a way that every parallel loop will have their own copy of ptr[ ] and random[ ] which they can modify themselves
// this whole part does the 'explosive percolation' and saves the data in maxclus, delmx, entropycalc array of different runs
//#pragma acc update device(maxclus,delmx,entropycalc,netmap1,netmap2)
//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])
//#pragma acc parallel loop private(ptr[0:vertices+1]) firstprivate(random[0:connectionNumber])
for(int rx=1;rx<=1;rx++)
int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;
int nodeA=0,nodeB=0;
int indx1=0;
int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;
double entropy = log(vertices);
//curandState_t state;
//curand_init(rand_seed*rx,0,0,&state);
for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;
//#pragma acc loop seq
for(int i=1;i<=vertices;i++)
if(i!=connectionNumber)
int algaRandomIndex = 0;
for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)
node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;
for(int nodeindex = 0; nodeindex<2*M; nodeindex++)
if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else
int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];
int clusmul[M];
int clusindex1 = 0;
for(int clusindex = 0; clusindex<M; clusindex++)
clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;
bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)
bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;
clusmulCheck = clusmulCheck && clusmulCheck1;
if(clusmulCheck)
nodeA = node[0];
nodeB = node[1];
for(int someK = 1; someK < M; someK++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;
else
int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)
if(clusmul[as]<low)
low = clusmul[as];
indx1 = as+1;
nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];
int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;
for(int ase = 1; ase < M; ase++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;
if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)
en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)
en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)
en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)
int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)
if(ptr[g]<ptr[h])
ptr[g] += ptr[h];
ptr[h] = g;
index = g;
else
ptr[h] += ptr[g];
ptr[g] = h;
index = h;
entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else
jump=big-bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
//#pragma acc atomic
entropycalc[i] += entropy;
bigtemp = big;
continue;
if(-ptr[index]>big) big = -ptr[index];
jump = big - bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
//#pragma acc atomic
if(entropy<0) entropy = 0;
entropycalc[i] += entropy;
bigtemp = big;
//vector<size_t>().swap(network);
//vector<int>().swap(network);
//network.clear();
//network.erase(network.begin(),network.end());
//cout<<network.capacity()<<endl;
network.shrink_to_fit();
//cout<<network.capacity()<<endl;
/*for(int i=0;i<connectionNumber;i++)
cout<<"maxclus: "<<maxclus[i]<<'t'<<"delmx: "<<delmx[i]<<'t'<<"entropy: "<<entropycalc[i]<<'t'<<endl;
*/
//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());
connectionNumber = con;
for(int i=1;i<=vertices;i++)
//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;
//fout1.close();
//fout2.close();
//fout3.close();
delete random;
random = NULL;
delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;
delete netmap1;
netmap1 = NULL;
delete netmap2;
netmap2 = NULL;
delete ptr;
ptr = NULL;
return 0;
c++ openmp
add a comment |Â
up vote
1
down vote
favorite
Here I uploaded a very basic Barabasi-Albert network generator and then a percolator which conducts percolation over the network following some rules. I have used openmp to parallelize the loops.
Here 3 arrays, maxclus
,delmx
and entropycalc
are shared between the parallel threads and netmap1
,netmap2
,ptr
and random
are made private to the threads. What it basically does is that, suppose you have a vector, and two arrays, then,
int* arrayresult = new int [N];
int* array;
#pragma omp parallel shared(arrayresult) private(array)
vector<int> someVec;
array = new int [N]
for(int k=0;k<somenum;k++) array[k] = 0;
#pragma omp for
for(int i=0;i<somenum;i++)
// do something with someVec;
// do something with array;
for(int j=0;j<somenum1;j++)
#pragma omp atomic
arrayresult[j] += someResult;
delete array;
Now this snippet describes the main gist of the code I am posting here. This shows a performance degradation proportional to the number of cores or threads being used. I am providing both the linear code and the parallel code.
How can I make the parallel one more efficient?
Parallel Code with OpenMP
//compile with icpc filename.cpp -o executable -O3 -std=c++14 -qopenmp
#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
#include<omp.h>
//#include "openacc_curand.h"
using namespace std;
int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network
// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)
if(ptr[i]<0) return i;
return ptr[i]=findroot(ptr,ptr[i]);
int main()
int seed,vertices,m,run,filenum,M;
//I am just going to set the initial value for your need
/*
cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 1;
//this sets up the connection and initializes the array;
int con = 0;
for(int i=1;i<seed;i++)
con = con + i;
con = con + (vertices-seed)*m;
//int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
//int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber
//for(int i=1;i<=con;i++)
//
// netmap1[i] = 0;
// netmap2[i] = 0;
//
connectionNumber = con;
srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;
//saved filename
string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";
filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
ofstream fout1,fout2,fout3;//,fout3;
//int* random = NULL;
//random = new int[connectionNumber+1];
double* maxclus = NULL;
maxclus = new double[vertices+1];
double* delmx = NULL;
delmx = new double[connectionNumber+1];
double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];
for(int i=0;i<=vertices;i++)
maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;
//for(int i=0;i<=connectionNumber;i++)
//
// random[i] = i;
//
//this is the pointer that needs to be made private for all the parallel loops
//int* ptr = new int[vertices+1];
//for(int i=0;i<vertices+1;i++) ptr[i]=0;
//the main program starts here
int* ptr; int* netmap1; int* netmap2; int* random;
int runcounter = 0;
#pragma omp parallel shared(con,runcounter,maxclus,delmx,entropycalc) private(ptr,netmap1,netmap2,random) firstprivate(connectionNumber)
ptr = new int[vertices+1];
netmap1 = new int[connectionNumber+1];
netmap2 = new int[connectionNumber+1];
random = new int[connectionNumber+1];
for(int l=0;l<=con;l++)
netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;
for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;
#pragma omp for schedule(static)
for(int i=1;i<=run;i++)
//#pragma omp critical
//cout<<"run : "<<i<<endl;
//vector<size_t> network;
vector<int> network;
/*for(int l=0;l<=con;l++)
netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;
for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;*/
connectionNumber = 0;
//cout<<network.capacity()<<endl;
//seeds are connected to the network
for(int i=1;i<=seed;i++)
for(int j=1;j<=seed;j++)
if(j>i)
connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);
int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;
for(int i = seed+1;i<=vertices; i++)
countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;
for(int j = 1; ;j++)
concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
for(int n=0;n<m;n++)
if(ab[n] == network[u]) concheck = 1;
//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)
ab[countm] = network[u];
countm=countm+1;
connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];
network.push_back(i);
network.push_back(network[u]);
if(countm==m) break;
//the random list of connection are shuffled
random_shuffle(&random[1],&random[con]);
for(int rx=1;rx<=1;rx++)
int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;
int nodeA=0,nodeB=0;
int indx1=0;
int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;
double entropy = log(vertices);
for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;
for(int i=1;i<=vertices;i++)
if(i!=connectionNumber)
int algaRandomIndex = 0;
for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)
node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;
for(int nodeindex = 0; nodeindex<2*M; nodeindex++)
if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else
int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];
int clusmul[M];
int clusindex1 = 0;
for(int clusindex = 0; clusindex<M; clusindex++)
clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;
bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)
bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;
clusmulCheck = clusmulCheck && clusmulCheck1;
if(clusmulCheck)
nodeA = node[0];
nodeB = node[1];
for(int someK = 1; someK < M; someK++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;
else
int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)
if(clusmul[as]<low)
low = clusmul[as];
indx1 = as+1;
nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];
int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;
for(int ase = 1; ase < M; ase++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;
if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)
en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)
en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)
en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)
int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)
if(ptr[g]<ptr[h])
ptr[g] += ptr[h];
ptr[h] = g;
index = g;
else
ptr[h] += ptr[g];
ptr[g] = h;
index = h;
entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else
jump=big-bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;
continue;
if(-ptr[index]>big) big = -ptr[index];
jump = big - bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;
network.clear();
#pragma omp atomic
runcounter++;
int rem = (runcounter * 100/run) % 5;
if(rem == 0)
cout<<"Progress: "<<(double)runcounter*100/run<<"%"<<endl;
delete ptr;
delete netmap1;
delete netmap2;
delete random;
//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());
connectionNumber = con;
for(int i=1;i<=vertices;i++)
//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;
//fout1.close();
//fout2.close();
//fout3.close();
//delete random;
//random = NULL;
//delete netmap1;
//netmap1 = NULL;
//delete netmap2;
//netmap2 = NULL;
//delete ptr;
//ptr = NULL;
delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;
return 0;
Linear Code
#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
//#include<bits/stdc++.h>
//#include "openacc_curand.h"
using namespace std;
//vector<int> network;
int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network
// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)
if(ptr[i]<0) return i;
return ptr[i]=findroot(ptr,ptr[i]);
/*#pragma acc routine seq
int findroot(int ptr,int i)
//cao = 1;
int r,s;
r = s = i;
while (ptr[r]>=0)
ptr[s] = ptr[r];
s = r;
r = ptr[r];
return r;
*/
int main()
int seed,vertices,m,run,filenum,M;
//I am just going to set the initial value for your need
/* cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 10;
//this sets up the connection and initializes the array;
int con = 0;
for(int i=1;i<seed;i++)
con = con + i;
con = con + (vertices-seed)*m;
int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber
for(int i=1;i<=con;i++)
netmap1[i] = 0;
netmap2[i] = 0;
connectionNumber = con;
srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;
//saved filename
string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";
filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
ofstream fout1,fout2,fout3;//,fout3;
int* random = NULL;
random = new int[connectionNumber+1];
double* maxclus = NULL;
maxclus = new double[vertices+1];
double* delmx = NULL;
delmx = new double[connectionNumber+1];
double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];
for(int i=0;i<=vertices;i++)
maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;
for(int i=0;i<=connectionNumber;i++)
random[i] = i;
//this is the pointer that needs to be made private for all the parallel loops
int* ptr = new int[vertices+1];
for(int i=0;i<vertices+1;i++) ptr[i]=0;
//the main program starts here
//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])
for(int i=1;i<=run;i++)
cout<<"run : "<<i<<endl;
//vector<size_t> network;
vector<int> network;
connectionNumber = 0;
//cout<<network.capacity()<<endl;
//seeds are connected to the network
for(int i=1;i<=seed;i++)
for(int j=1;j<=seed;j++)
if(j>i)
connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);
int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;
for(int i = seed+1;i<=vertices; i++)
countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;
for(int j = 1; ;j++)
concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
for(int n=0;n<m;n++)
if(ab[n] == network[u]) concheck = 1;
//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)
ab[countm] = network[u];
countm=countm+1;
connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];
network.push_back(i);
network.push_back(network[u]);
if(countm==m) break;
//the random list of connection are shuffled
random_shuffle(&random[1],&random[connectionNumber]);
double rand_seed = time(NULL);
//this is where the problem lies
//basically i want to make all the rx loops parallel in such a way that every parallel loop will have their own copy of ptr[ ] and random[ ] which they can modify themselves
// this whole part does the 'explosive percolation' and saves the data in maxclus, delmx, entropycalc array of different runs
//#pragma acc update device(maxclus,delmx,entropycalc,netmap1,netmap2)
//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])
//#pragma acc parallel loop private(ptr[0:vertices+1]) firstprivate(random[0:connectionNumber])
for(int rx=1;rx<=1;rx++)
int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;
int nodeA=0,nodeB=0;
int indx1=0;
int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;
double entropy = log(vertices);
//curandState_t state;
//curand_init(rand_seed*rx,0,0,&state);
for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;
//#pragma acc loop seq
for(int i=1;i<=vertices;i++)
if(i!=connectionNumber)
int algaRandomIndex = 0;
for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)
node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;
for(int nodeindex = 0; nodeindex<2*M; nodeindex++)
if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else
int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];
int clusmul[M];
int clusindex1 = 0;
for(int clusindex = 0; clusindex<M; clusindex++)
clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;
bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)
bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;
clusmulCheck = clusmulCheck && clusmulCheck1;
if(clusmulCheck)
nodeA = node[0];
nodeB = node[1];
for(int someK = 1; someK < M; someK++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;
else
int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)
if(clusmul[as]<low)
low = clusmul[as];
indx1 = as+1;
nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];
int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;
for(int ase = 1; ase < M; ase++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;
if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)
en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)
en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)
en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)
int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)
if(ptr[g]<ptr[h])
ptr[g] += ptr[h];
ptr[h] = g;
index = g;
else
ptr[h] += ptr[g];
ptr[g] = h;
index = h;
entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else
jump=big-bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
//#pragma acc atomic
entropycalc[i] += entropy;
bigtemp = big;
continue;
if(-ptr[index]>big) big = -ptr[index];
jump = big - bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
//#pragma acc atomic
if(entropy<0) entropy = 0;
entropycalc[i] += entropy;
bigtemp = big;
//vector<size_t>().swap(network);
//vector<int>().swap(network);
//network.clear();
//network.erase(network.begin(),network.end());
//cout<<network.capacity()<<endl;
network.shrink_to_fit();
//cout<<network.capacity()<<endl;
/*for(int i=0;i<connectionNumber;i++)
cout<<"maxclus: "<<maxclus[i]<<'t'<<"delmx: "<<delmx[i]<<'t'<<"entropy: "<<entropycalc[i]<<'t'<<endl;
*/
//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());
connectionNumber = con;
for(int i=1;i<=vertices;i++)
//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;
//fout1.close();
//fout2.close();
//fout3.close();
delete random;
random = NULL;
delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;
delete netmap1;
netmap1 = NULL;
delete netmap2;
netmap2 = NULL;
delete ptr;
ptr = NULL;
return 0;
c++ openmp
3
With a nearly 500 lines longmain()
function this code is hard to reason about. Maybe split it into multiple function? I've been staring at it for like 15 minutes, and still can't get a gist of what is actually being done. Doing so might even help the optimizer, as it then has more knowledge about the lifetime and scope of variables.
â hoffmale
Jul 14 at 14:36
add a comment |Â
up vote
1
down vote
favorite
up vote
1
down vote
favorite
Here I uploaded a very basic Barabasi-Albert network generator and then a percolator which conducts percolation over the network following some rules. I have used openmp to parallelize the loops.
Here 3 arrays, maxclus
,delmx
and entropycalc
are shared between the parallel threads and netmap1
,netmap2
,ptr
and random
are made private to the threads. What it basically does is that, suppose you have a vector, and two arrays, then,
int* arrayresult = new int [N];
int* array;
#pragma omp parallel shared(arrayresult) private(array)
vector<int> someVec;
array = new int [N]
for(int k=0;k<somenum;k++) array[k] = 0;
#pragma omp for
for(int i=0;i<somenum;i++)
// do something with someVec;
// do something with array;
for(int j=0;j<somenum1;j++)
#pragma omp atomic
arrayresult[j] += someResult;
delete array;
Now this snippet describes the main gist of the code I am posting here. This shows a performance degradation proportional to the number of cores or threads being used. I am providing both the linear code and the parallel code.
How can I make the parallel one more efficient?
Parallel Code with OpenMP
//compile with icpc filename.cpp -o executable -O3 -std=c++14 -qopenmp
#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
#include<omp.h>
//#include "openacc_curand.h"
using namespace std;
int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network
// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)
if(ptr[i]<0) return i;
return ptr[i]=findroot(ptr,ptr[i]);
int main()
int seed,vertices,m,run,filenum,M;
//I am just going to set the initial value for your need
/*
cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 1;
//this sets up the connection and initializes the array;
int con = 0;
for(int i=1;i<seed;i++)
con = con + i;
con = con + (vertices-seed)*m;
//int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
//int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber
//for(int i=1;i<=con;i++)
//
// netmap1[i] = 0;
// netmap2[i] = 0;
//
connectionNumber = con;
srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;
//saved filename
string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";
filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
ofstream fout1,fout2,fout3;//,fout3;
//int* random = NULL;
//random = new int[connectionNumber+1];
double* maxclus = NULL;
maxclus = new double[vertices+1];
double* delmx = NULL;
delmx = new double[connectionNumber+1];
double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];
for(int i=0;i<=vertices;i++)
maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;
//for(int i=0;i<=connectionNumber;i++)
//
// random[i] = i;
//
//this is the pointer that needs to be made private for all the parallel loops
//int* ptr = new int[vertices+1];
//for(int i=0;i<vertices+1;i++) ptr[i]=0;
//the main program starts here
int* ptr; int* netmap1; int* netmap2; int* random;
int runcounter = 0;
#pragma omp parallel shared(con,runcounter,maxclus,delmx,entropycalc) private(ptr,netmap1,netmap2,random) firstprivate(connectionNumber)
ptr = new int[vertices+1];
netmap1 = new int[connectionNumber+1];
netmap2 = new int[connectionNumber+1];
random = new int[connectionNumber+1];
for(int l=0;l<=con;l++)
netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;
for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;
#pragma omp for schedule(static)
for(int i=1;i<=run;i++)
//#pragma omp critical
//cout<<"run : "<<i<<endl;
//vector<size_t> network;
vector<int> network;
/*for(int l=0;l<=con;l++)
netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;
for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;*/
connectionNumber = 0;
//cout<<network.capacity()<<endl;
//seeds are connected to the network
for(int i=1;i<=seed;i++)
for(int j=1;j<=seed;j++)
if(j>i)
connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);
int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;
for(int i = seed+1;i<=vertices; i++)
countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;
for(int j = 1; ;j++)
concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
for(int n=0;n<m;n++)
if(ab[n] == network[u]) concheck = 1;
//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)
ab[countm] = network[u];
countm=countm+1;
connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];
network.push_back(i);
network.push_back(network[u]);
if(countm==m) break;
//the random list of connection are shuffled
random_shuffle(&random[1],&random[con]);
for(int rx=1;rx<=1;rx++)
int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;
int nodeA=0,nodeB=0;
int indx1=0;
int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;
double entropy = log(vertices);
for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;
for(int i=1;i<=vertices;i++)
if(i!=connectionNumber)
int algaRandomIndex = 0;
for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)
node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;
for(int nodeindex = 0; nodeindex<2*M; nodeindex++)
if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else
int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];
int clusmul[M];
int clusindex1 = 0;
for(int clusindex = 0; clusindex<M; clusindex++)
clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;
bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)
bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;
clusmulCheck = clusmulCheck && clusmulCheck1;
if(clusmulCheck)
nodeA = node[0];
nodeB = node[1];
for(int someK = 1; someK < M; someK++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;
else
int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)
if(clusmul[as]<low)
low = clusmul[as];
indx1 = as+1;
nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];
int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;
for(int ase = 1; ase < M; ase++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;
if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)
en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)
en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)
en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)
int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)
if(ptr[g]<ptr[h])
ptr[g] += ptr[h];
ptr[h] = g;
index = g;
else
ptr[h] += ptr[g];
ptr[g] = h;
index = h;
entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else
jump=big-bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;
continue;
if(-ptr[index]>big) big = -ptr[index];
jump = big - bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;
network.clear();
#pragma omp atomic
runcounter++;
int rem = (runcounter * 100/run) % 5;
if(rem == 0)
cout<<"Progress: "<<(double)runcounter*100/run<<"%"<<endl;
delete ptr;
delete netmap1;
delete netmap2;
delete random;
//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());
connectionNumber = con;
for(int i=1;i<=vertices;i++)
//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;
//fout1.close();
//fout2.close();
//fout3.close();
//delete random;
//random = NULL;
//delete netmap1;
//netmap1 = NULL;
//delete netmap2;
//netmap2 = NULL;
//delete ptr;
//ptr = NULL;
delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;
return 0;
Linear Code
#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
//#include<bits/stdc++.h>
//#include "openacc_curand.h"
using namespace std;
//vector<int> network;
int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network
// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)
if(ptr[i]<0) return i;
return ptr[i]=findroot(ptr,ptr[i]);
/*#pragma acc routine seq
int findroot(int ptr,int i)
//cao = 1;
int r,s;
r = s = i;
while (ptr[r]>=0)
ptr[s] = ptr[r];
s = r;
r = ptr[r];
return r;
*/
int main()
int seed,vertices,m,run,filenum,M;
//I am just going to set the initial value for your need
/* cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 10;
//this sets up the connection and initializes the array;
int con = 0;
for(int i=1;i<seed;i++)
con = con + i;
con = con + (vertices-seed)*m;
int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber
for(int i=1;i<=con;i++)
netmap1[i] = 0;
netmap2[i] = 0;
connectionNumber = con;
srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;
//saved filename
string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";
filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
ofstream fout1,fout2,fout3;//,fout3;
int* random = NULL;
random = new int[connectionNumber+1];
double* maxclus = NULL;
maxclus = new double[vertices+1];
double* delmx = NULL;
delmx = new double[connectionNumber+1];
double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];
for(int i=0;i<=vertices;i++)
maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;
for(int i=0;i<=connectionNumber;i++)
random[i] = i;
//this is the pointer that needs to be made private for all the parallel loops
int* ptr = new int[vertices+1];
for(int i=0;i<vertices+1;i++) ptr[i]=0;
//the main program starts here
//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])
for(int i=1;i<=run;i++)
cout<<"run : "<<i<<endl;
//vector<size_t> network;
vector<int> network;
connectionNumber = 0;
//cout<<network.capacity()<<endl;
//seeds are connected to the network
for(int i=1;i<=seed;i++)
for(int j=1;j<=seed;j++)
if(j>i)
connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);
int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;
for(int i = seed+1;i<=vertices; i++)
countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;
for(int j = 1; ;j++)
concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
for(int n=0;n<m;n++)
if(ab[n] == network[u]) concheck = 1;
//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)
ab[countm] = network[u];
countm=countm+1;
connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];
network.push_back(i);
network.push_back(network[u]);
if(countm==m) break;
//the random list of connection are shuffled
random_shuffle(&random[1],&random[connectionNumber]);
double rand_seed = time(NULL);
//this is where the problem lies
//basically i want to make all the rx loops parallel in such a way that every parallel loop will have their own copy of ptr[ ] and random[ ] which they can modify themselves
// this whole part does the 'explosive percolation' and saves the data in maxclus, delmx, entropycalc array of different runs
//#pragma acc update device(maxclus,delmx,entropycalc,netmap1,netmap2)
//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])
//#pragma acc parallel loop private(ptr[0:vertices+1]) firstprivate(random[0:connectionNumber])
for(int rx=1;rx<=1;rx++)
int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;
int nodeA=0,nodeB=0;
int indx1=0;
int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;
double entropy = log(vertices);
//curandState_t state;
//curand_init(rand_seed*rx,0,0,&state);
for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;
//#pragma acc loop seq
for(int i=1;i<=vertices;i++)
if(i!=connectionNumber)
int algaRandomIndex = 0;
for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)
node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;
for(int nodeindex = 0; nodeindex<2*M; nodeindex++)
if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else
int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];
int clusmul[M];
int clusindex1 = 0;
for(int clusindex = 0; clusindex<M; clusindex++)
clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;
bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)
bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;
clusmulCheck = clusmulCheck && clusmulCheck1;
if(clusmulCheck)
nodeA = node[0];
nodeB = node[1];
for(int someK = 1; someK < M; someK++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;
else
int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)
if(clusmul[as]<low)
low = clusmul[as];
indx1 = as+1;
nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];
int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;
for(int ase = 1; ase < M; ase++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;
if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)
en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)
en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)
en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)
int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)
if(ptr[g]<ptr[h])
ptr[g] += ptr[h];
ptr[h] = g;
index = g;
else
ptr[h] += ptr[g];
ptr[g] = h;
index = h;
entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else
jump=big-bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
//#pragma acc atomic
entropycalc[i] += entropy;
bigtemp = big;
continue;
if(-ptr[index]>big) big = -ptr[index];
jump = big - bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
//#pragma acc atomic
if(entropy<0) entropy = 0;
entropycalc[i] += entropy;
bigtemp = big;
//vector<size_t>().swap(network);
//vector<int>().swap(network);
//network.clear();
//network.erase(network.begin(),network.end());
//cout<<network.capacity()<<endl;
network.shrink_to_fit();
//cout<<network.capacity()<<endl;
/*for(int i=0;i<connectionNumber;i++)
cout<<"maxclus: "<<maxclus[i]<<'t'<<"delmx: "<<delmx[i]<<'t'<<"entropy: "<<entropycalc[i]<<'t'<<endl;
*/
//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());
connectionNumber = con;
for(int i=1;i<=vertices;i++)
//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;
//fout1.close();
//fout2.close();
//fout3.close();
delete random;
random = NULL;
delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;
delete netmap1;
netmap1 = NULL;
delete netmap2;
netmap2 = NULL;
delete ptr;
ptr = NULL;
return 0;
c++ openmp
Here I uploaded a very basic Barabasi-Albert network generator and then a percolator which conducts percolation over the network following some rules. I have used openmp to parallelize the loops.
Here 3 arrays, maxclus
,delmx
and entropycalc
are shared between the parallel threads and netmap1
,netmap2
,ptr
and random
are made private to the threads. What it basically does is that, suppose you have a vector, and two arrays, then,
int* arrayresult = new int [N];
int* array;
#pragma omp parallel shared(arrayresult) private(array)
vector<int> someVec;
array = new int [N]
for(int k=0;k<somenum;k++) array[k] = 0;
#pragma omp for
for(int i=0;i<somenum;i++)
// do something with someVec;
// do something with array;
for(int j=0;j<somenum1;j++)
#pragma omp atomic
arrayresult[j] += someResult;
delete array;
Now this snippet describes the main gist of the code I am posting here. This shows a performance degradation proportional to the number of cores or threads being used. I am providing both the linear code and the parallel code.
How can I make the parallel one more efficient?
Parallel Code with OpenMP
//compile with icpc filename.cpp -o executable -O3 -std=c++14 -qopenmp
#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
#include<omp.h>
//#include "openacc_curand.h"
using namespace std;
int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network
// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)
if(ptr[i]<0) return i;
return ptr[i]=findroot(ptr,ptr[i]);
int main()
int seed,vertices,m,run,filenum,M;
//I am just going to set the initial value for your need
/*
cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 1;
//this sets up the connection and initializes the array;
int con = 0;
for(int i=1;i<seed;i++)
con = con + i;
con = con + (vertices-seed)*m;
//int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
//int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber
//for(int i=1;i<=con;i++)
//
// netmap1[i] = 0;
// netmap2[i] = 0;
//
connectionNumber = con;
srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;
//saved filename
string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";
filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
ofstream fout1,fout2,fout3;//,fout3;
//int* random = NULL;
//random = new int[connectionNumber+1];
double* maxclus = NULL;
maxclus = new double[vertices+1];
double* delmx = NULL;
delmx = new double[connectionNumber+1];
double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];
for(int i=0;i<=vertices;i++)
maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;
//for(int i=0;i<=connectionNumber;i++)
//
// random[i] = i;
//
//this is the pointer that needs to be made private for all the parallel loops
//int* ptr = new int[vertices+1];
//for(int i=0;i<vertices+1;i++) ptr[i]=0;
//the main program starts here
int* ptr; int* netmap1; int* netmap2; int* random;
int runcounter = 0;
#pragma omp parallel shared(con,runcounter,maxclus,delmx,entropycalc) private(ptr,netmap1,netmap2,random) firstprivate(connectionNumber)
ptr = new int[vertices+1];
netmap1 = new int[connectionNumber+1];
netmap2 = new int[connectionNumber+1];
random = new int[connectionNumber+1];
for(int l=0;l<=con;l++)
netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;
for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;
#pragma omp for schedule(static)
for(int i=1;i<=run;i++)
//#pragma omp critical
//cout<<"run : "<<i<<endl;
//vector<size_t> network;
vector<int> network;
/*for(int l=0;l<=con;l++)
netmap1[l] = 0;
netmap2[l] = 0;
random[l] = l;
for(int l=0;l<=vertices;l++)
ptr[l] = EMPTY;*/
connectionNumber = 0;
//cout<<network.capacity()<<endl;
//seeds are connected to the network
for(int i=1;i<=seed;i++)
for(int j=1;j<=seed;j++)
if(j>i)
connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);
int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;
for(int i = seed+1;i<=vertices; i++)
countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;
for(int j = 1; ;j++)
concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
for(int n=0;n<m;n++)
if(ab[n] == network[u]) concheck = 1;
//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)
ab[countm] = network[u];
countm=countm+1;
connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];
network.push_back(i);
network.push_back(network[u]);
if(countm==m) break;
//the random list of connection are shuffled
random_shuffle(&random[1],&random[con]);
for(int rx=1;rx<=1;rx++)
int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;
int nodeA=0,nodeB=0;
int indx1=0;
int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;
double entropy = log(vertices);
for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;
for(int i=1;i<=vertices;i++)
if(i!=connectionNumber)
int algaRandomIndex = 0;
for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)
node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;
for(int nodeindex = 0; nodeindex<2*M; nodeindex++)
if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else
int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];
int clusmul[M];
int clusindex1 = 0;
for(int clusindex = 0; clusindex<M; clusindex++)
clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;
bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)
bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;
clusmulCheck = clusmulCheck && clusmulCheck1;
if(clusmulCheck)
nodeA = node[0];
nodeB = node[1];
for(int someK = 1; someK < M; someK++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;
else
int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)
if(clusmul[as]<low)
low = clusmul[as];
indx1 = as+1;
nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];
int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;
for(int ase = 1; ase < M; ase++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;
if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)
en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)
en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)
en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)
int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)
if(ptr[g]<ptr[h])
ptr[g] += ptr[h];
ptr[h] = g;
index = g;
else
ptr[h] += ptr[g];
ptr[g] = h;
index = h;
entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else
jump=big-bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;
continue;
if(-ptr[index]>big) big = -ptr[index];
jump = big - bigtemp;
#pragma omp atomic
maxclus[i] += big;
#pragma omp atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
#pragma omp atomic
entropycalc[i] += entropy;
bigtemp = big;
network.clear();
#pragma omp atomic
runcounter++;
int rem = (runcounter * 100/run) % 5;
if(rem == 0)
cout<<"Progress: "<<(double)runcounter*100/run<<"%"<<endl;
delete ptr;
delete netmap1;
delete netmap2;
delete random;
//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());
connectionNumber = con;
for(int i=1;i<=vertices;i++)
//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;
//fout1.close();
//fout2.close();
//fout3.close();
//delete random;
//random = NULL;
//delete netmap1;
//netmap1 = NULL;
//delete netmap2;
//netmap2 = NULL;
//delete ptr;
//ptr = NULL;
delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;
return 0;
Linear Code
#include<iostream>
#include<vector>
#include<string>
#include<cstdlib>
#include<iomanip>
#include<ctime>
#include<cmath>
#include<random>
#include<fstream>
#include<algorithm>
//#include<bits/stdc++.h>
//#include "openacc_curand.h"
using namespace std;
//vector<int> network;
int EMPTY = 0;
int connectionNumber = 0; // it indexes the connection between different nodes of the network
// this function does the union find work for the percolation
//#pragma acc routine seq
int findroot(int ptr,int i)
if(ptr[i]<0) return i;
return ptr[i]=findroot(ptr,ptr[i]);
/*#pragma acc routine seq
int findroot(int ptr,int i)
//cao = 1;
int r,s;
r = s = i;
while (ptr[r]>=0)
ptr[s] = ptr[r];
s = r;
r = ptr[r];
return r;
*/
int main()
int seed,vertices,m,run,filenum,M;
//I am just going to set the initial value for your need
/* cout<<"enter seed size: ";
cin>>seed;
cout<<endl;
cout<<"enter vertice number: ";
cin>>vertices;
cout<<endl;
cout<<"order number: ";
cin>>m;
cout<<endl;
cout<<"order of Explosive Percolation: ";
cin>>M;
cout<<endl;
cout<<"enter ensemble number: ";
cin>>run;
cout<<endl;
cout<<"enter filenumber: ";
cin>>filenum;
cout<<endl;
*/
seed = 6;
vertices = 500000;
m = 5;
M = 12;
run = 50;
filenum = 10;
//this sets up the connection and initializes the array;
int con = 0;
for(int i=1;i<seed;i++)
con = con + i;
con = con + (vertices-seed)*m;
int* netmap1 = new int[con+1]; //node 1 that is connected to a certain connectionNumber
int* netmap2 = new int[con+1]; //node 2 that is connected to a certain connectionNumber
for(int i=1;i<=con;i++)
netmap1[i] = 0;
netmap2[i] = 0;
connectionNumber = con;
srand(time(NULL));
int A,B,C;
A = vertices;
B = run;
C = filenum;
//saved filename
string filename1;
filename1 = "maxclus_";
string filename2;
filename2 = "delmx_";
string filename3;
filename3 = "entropy_";
filename1 = filename1+to_string(A)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(B)+"ens"+to_string(C)+".dat";
filename2 = filename2+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
filename3 = filename3+to_string(vertices)+"node_"+to_string(m)+"m_"+to_string(M)+"M_"+to_string(run)+"ens"+to_string(filenum)+".dat";
ofstream fout1,fout2,fout3;//,fout3;
int* random = NULL;
random = new int[connectionNumber+1];
double* maxclus = NULL;
maxclus = new double[vertices+1];
double* delmx = NULL;
delmx = new double[connectionNumber+1];
double* entropycalc = NULL;
entropycalc = new double[connectionNumber+1];
for(int i=0;i<=vertices;i++)
maxclus[i]=0;
delmx[i]=0;
entropycalc[i]=0;
for(int i=0;i<=connectionNumber;i++)
random[i] = i;
//this is the pointer that needs to be made private for all the parallel loops
int* ptr = new int[vertices+1];
for(int i=0;i<vertices+1;i++) ptr[i]=0;
//the main program starts here
//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])
for(int i=1;i<=run;i++)
cout<<"run : "<<i<<endl;
//vector<size_t> network;
vector<int> network;
connectionNumber = 0;
//cout<<network.capacity()<<endl;
//seeds are connected to the network
for(int i=1;i<=seed;i++)
for(int j=1;j<=seed;j++)
if(j>i)
connectionNumber=connectionNumber + 1;
netmap1[connectionNumber]=i; //connections are addressed
netmap2[connectionNumber]=j;
network.push_back(i); // the vector is updated for making connection
network.push_back(j);
int concheck = 0;
int ab[m]; //this array checks if a node is chosen twice
int countm = 0;
for(int i = seed+1;i<=vertices; i++)
countm = 0;
for(int k=0;k<m;k++) ab[k] = 0;
for(int j = 1; ;j++)
concheck = 0;
int N1=network.size() ;
int M1=0;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
for(int n=0;n<m;n++)
if(ab[n] == network[u]) concheck = 1;
//if condition is fulfilled the connection are given to the nodes
//the data is saved in the arrays of the connection
if(concheck == 0 && network[u]!=i)
ab[countm] = network[u];
countm=countm+1;
connectionNumber=connectionNumber+1;
netmap1[connectionNumber] = i;
netmap2[connectionNumber] = network[u];
network.push_back(i);
network.push_back(network[u]);
if(countm==m) break;
//the random list of connection are shuffled
random_shuffle(&random[1],&random[connectionNumber]);
double rand_seed = time(NULL);
//this is where the problem lies
//basically i want to make all the rx loops parallel in such a way that every parallel loop will have their own copy of ptr[ ] and random[ ] which they can modify themselves
// this whole part does the 'explosive percolation' and saves the data in maxclus, delmx, entropycalc array of different runs
//#pragma acc update device(maxclus,delmx,entropycalc,netmap1,netmap2)
//#pragma acc data copy(maxclus[0:connectionNumber],delmx[0:connectionNumber],entropycalc[0:connectionNumber]), copyin(netmap1[0:connectionNumber],netmap2[0:connectionNumber])
//#pragma acc parallel loop private(ptr[0:vertices+1]) firstprivate(random[0:connectionNumber])
for(int rx=1;rx<=1;rx++)
int index=0,big=0,bigtemp=0,jump=0,en1=0,en2=0;
int nodeA=0,nodeB=0;
int indx1=0;
int node[2*M+1];// = 0;
int clus[2*M+1];// = 0;
double entropy = log(vertices);
//curandState_t state;
//curand_init(rand_seed*rx,0,0,&state);
for(int i=0;i<=vertices;i++) ptr[i] = EMPTY;
//#pragma acc loop seq
for(int i=1;i<=vertices;i++)
if(i!=connectionNumber)
int algaRandomIndex = 0;
for(int nodeindex = 0; nodeindex<2*M; nodeindex+=2)
node[nodeindex] = netmap1[random[i + algaRandomIndex]];
node[nodeindex + 1] = netmap2[random[i + algaRandomIndex]];
algaRandomIndex++;
for(int nodeindex = 0; nodeindex<2*M; nodeindex++)
if(ptr[node[nodeindex]]==EMPTY) clus[nodeindex] = 1;
else
int x = findroot(ptr,node[nodeindex]);
clus[nodeindex] = -ptr[x];
int clusmul[M];
int clusindex1 = 0;
for(int clusindex = 0; clusindex<M; clusindex++)
clusmul[clusindex] = clus[clusindex1]*clus[clusindex1+1];
clusindex1 += 2;
bool clusmulCheck = true;
for(int ase = 0; ase < M; ase++)
bool clusmulCheck1 = true;
if(clusmul[ase] == 1) clusmulCheck1 = true;
else clusmulCheck1 = false;
clusmulCheck = clusmulCheck && clusmulCheck1;
if(clusmulCheck)
nodeA = node[0];
nodeB = node[1];
for(int someK = 1; someK < M; someK++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+someK];
random[i+someK] = temp;
else
int low = clusmul[0];
indx1 = 1;
for(int as=0;as<11;as++)
if(clusmul[as]<low)
low = clusmul[as];
indx1 = as+1;
nodeA = node[2*indx1 - 2];
nodeB = node[2*indx1 - 1];
int temp = random[i+(indx1-1)];
random[i+(indx1-1)] = random[i];
random[i] = temp;
for(int ase = 1; ase < M; ase++)
int N1=connectionNumber;
int M1=i+M;
int u = M1 + rand()/(RAND_MAX/(N1-M1+1) + 1);
int temp = random[u];
random[u] = random[i+ ase];
random[i+ ase] = temp;
if(ptr[nodeA]==EMPTY && ptr[nodeB]==EMPTY)
en1=1;
en2=1;
ptr[nodeA] = -2;
ptr[nodeB] = nodeA;
index = nodeA;
entropy = (double)(entropy-(-2.0/vertices*log(1.0/vertices))+(-2.0/vertices*log(2.0/vertices)));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]==EMPTY && ptr[nodeB]!=EMPTY)
en1=1;
int e = findroot(ptr,nodeB);
en2 = -(ptr[e]);
ptr[nodeA] = e;
ptr[e] += -1;
index = e;
entropy = entropy-(-(double)1.0/vertices*log(1.0/(double)vertices))-(-(double)en2/vertices*log((double)en2/vertices))+(-( double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]==EMPTY)
en2 = 1;
int f = findroot(ptr,nodeA);
en1 = -(ptr[f]);
ptr[nodeB] = f;
ptr[f] += -1;
index = f;
entropy = entropy-(-(double)1.0/(double)vertices*log(1.0/(double)vertices))-(-(double)en1/(double)vertices*log((double)en1/vertices))+(-(double)(-ptr[index])/vertices*log((-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else if(ptr[nodeA]!=EMPTY && ptr[nodeB]!=EMPTY)
int g,h;
g = findroot(ptr,nodeA);
en1 = -(ptr[g]);
h = findroot(ptr,nodeB);
en2 = -(ptr[h]);
if(g!=h)
if(ptr[g]<ptr[h])
ptr[g] += ptr[h];
ptr[h] = g;
index = g;
else
ptr[h] += ptr[g];
ptr[g] = h;
index = h;
entropy = entropy-(-(double)en1/(double)vertices*log((double)en1/(double)vertices))-(-(double)en2/vertices*log((double)en2/(double)vertices))+(-(double)(-ptr[index])/vertices*log((double)(-ptr[index])/(double)vertices));
if(entropy<0) entropy = 0;
else
jump=big-bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
if(entropy<0) entropy = 0;
//#pragma acc atomic
entropycalc[i] += entropy;
bigtemp = big;
continue;
if(-ptr[index]>big) big = -ptr[index];
jump = big - bigtemp;
//#pragma acc atomic
maxclus[i] += big;
//#pragma acc atomic
delmx[i] += jump;
//#pragma acc atomic
if(entropy<0) entropy = 0;
entropycalc[i] += entropy;
bigtemp = big;
//vector<size_t>().swap(network);
//vector<int>().swap(network);
//network.clear();
//network.erase(network.begin(),network.end());
//cout<<network.capacity()<<endl;
network.shrink_to_fit();
//cout<<network.capacity()<<endl;
/*for(int i=0;i<connectionNumber;i++)
cout<<"maxclus: "<<maxclus[i]<<'t'<<"delmx: "<<delmx[i]<<'t'<<"entropy: "<<entropycalc[i]<<'t'<<endl;
*/
//fout1.open(filename1.c_str());
//fout2.open(filename2.c_str());
//fout3.open(filename3.c_str());
connectionNumber = con;
for(int i=1;i<=vertices;i++)
//fout1<<(double)i/vertices<<'t'<<(double)maxclus[i]/vertices/run<<endl;
//fout2<<(double)i/vertices<<'t'<<(double)delmx[i]/run<<endl;
//fout3<<(double)i/vertices<<'t'<<(double)entropycalc[i]/run<<endl;
//fout1.close();
//fout2.close();
//fout3.close();
delete random;
random = NULL;
delete maxclus;
maxclus = NULL;
delete delmx;
delmx = NULL;
delete entropycalc;
entropycalc = NULL;
delete netmap1;
netmap1 = NULL;
delete netmap2;
netmap2 = NULL;
delete ptr;
ptr = NULL;
return 0;
c++ openmp
edited Jul 14 at 13:59
Stephen Rauch
3,49951430
3,49951430
asked Jul 14 at 7:57
Digonto Islam
354
354
3
With a nearly 500 lines longmain()
function this code is hard to reason about. Maybe split it into multiple function? I've been staring at it for like 15 minutes, and still can't get a gist of what is actually being done. Doing so might even help the optimizer, as it then has more knowledge about the lifetime and scope of variables.
â hoffmale
Jul 14 at 14:36
add a comment |Â
3
With a nearly 500 lines longmain()
function this code is hard to reason about. Maybe split it into multiple function? I've been staring at it for like 15 minutes, and still can't get a gist of what is actually being done. Doing so might even help the optimizer, as it then has more knowledge about the lifetime and scope of variables.
â hoffmale
Jul 14 at 14:36
3
3
With a nearly 500 lines long
main()
function this code is hard to reason about. Maybe split it into multiple function? I've been staring at it for like 15 minutes, and still can't get a gist of what is actually being done. Doing so might even help the optimizer, as it then has more knowledge about the lifetime and scope of variables.â hoffmale
Jul 14 at 14:36
With a nearly 500 lines long
main()
function this code is hard to reason about. Maybe split it into multiple function? I've been staring at it for like 15 minutes, and still can't get a gist of what is actually being done. Doing so might even help the optimizer, as it then has more knowledge about the lifetime and scope of variables.â hoffmale
Jul 14 at 14:36
add a comment |Â
1 Answer
1
active
oldest
votes
up vote
3
down vote
accepted
I haven't examined the code in great detail yet, but the first point that nearly jumped out was the use of rand()
inside the parallelized loop.
Each call to rand()
not only retrieves data, but also modifies a seed that's normally shared between all threads, so access to that seed is serialized. In other words, the calls to rand
won't run in parallel. My immediate advice would be to switch to the new random number generation classes that were added in C++11. They're kind of clumsy to use initially, but since each generator is an object, it's trivial to have a private instance for each thread, so they can all run in parallel.
That really helped to get a performance boost. But is there any other ways to make it more efficient, like I am usingmaxclus
,delmx
andentropycalc
as the shared arrays andptr
,random
,netmap1
,netmap2
and a vectornetwork
as private, is that making any performance degradation?
â Digonto Islam
Jul 17 at 9:54
add a comment |Â
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
3
down vote
accepted
I haven't examined the code in great detail yet, but the first point that nearly jumped out was the use of rand()
inside the parallelized loop.
Each call to rand()
not only retrieves data, but also modifies a seed that's normally shared between all threads, so access to that seed is serialized. In other words, the calls to rand
won't run in parallel. My immediate advice would be to switch to the new random number generation classes that were added in C++11. They're kind of clumsy to use initially, but since each generator is an object, it's trivial to have a private instance for each thread, so they can all run in parallel.
That really helped to get a performance boost. But is there any other ways to make it more efficient, like I am usingmaxclus
,delmx
andentropycalc
as the shared arrays andptr
,random
,netmap1
,netmap2
and a vectornetwork
as private, is that making any performance degradation?
â Digonto Islam
Jul 17 at 9:54
add a comment |Â
up vote
3
down vote
accepted
I haven't examined the code in great detail yet, but the first point that nearly jumped out was the use of rand()
inside the parallelized loop.
Each call to rand()
not only retrieves data, but also modifies a seed that's normally shared between all threads, so access to that seed is serialized. In other words, the calls to rand
won't run in parallel. My immediate advice would be to switch to the new random number generation classes that were added in C++11. They're kind of clumsy to use initially, but since each generator is an object, it's trivial to have a private instance for each thread, so they can all run in parallel.
That really helped to get a performance boost. But is there any other ways to make it more efficient, like I am usingmaxclus
,delmx
andentropycalc
as the shared arrays andptr
,random
,netmap1
,netmap2
and a vectornetwork
as private, is that making any performance degradation?
â Digonto Islam
Jul 17 at 9:54
add a comment |Â
up vote
3
down vote
accepted
up vote
3
down vote
accepted
I haven't examined the code in great detail yet, but the first point that nearly jumped out was the use of rand()
inside the parallelized loop.
Each call to rand()
not only retrieves data, but also modifies a seed that's normally shared between all threads, so access to that seed is serialized. In other words, the calls to rand
won't run in parallel. My immediate advice would be to switch to the new random number generation classes that were added in C++11. They're kind of clumsy to use initially, but since each generator is an object, it's trivial to have a private instance for each thread, so they can all run in parallel.
I haven't examined the code in great detail yet, but the first point that nearly jumped out was the use of rand()
inside the parallelized loop.
Each call to rand()
not only retrieves data, but also modifies a seed that's normally shared between all threads, so access to that seed is serialized. In other words, the calls to rand
won't run in parallel. My immediate advice would be to switch to the new random number generation classes that were added in C++11. They're kind of clumsy to use initially, but since each generator is an object, it's trivial to have a private instance for each thread, so they can all run in parallel.
answered Jul 15 at 5:14
Jerry Coffin
27.3k360123
27.3k360123
That really helped to get a performance boost. But is there any other ways to make it more efficient, like I am usingmaxclus
,delmx
andentropycalc
as the shared arrays andptr
,random
,netmap1
,netmap2
and a vectornetwork
as private, is that making any performance degradation?
â Digonto Islam
Jul 17 at 9:54
add a comment |Â
That really helped to get a performance boost. But is there any other ways to make it more efficient, like I am usingmaxclus
,delmx
andentropycalc
as the shared arrays andptr
,random
,netmap1
,netmap2
and a vectornetwork
as private, is that making any performance degradation?
â Digonto Islam
Jul 17 at 9:54
That really helped to get a performance boost. But is there any other ways to make it more efficient, like I am using
maxclus
,delmx
and entropycalc
as the shared arrays and ptr
, random
, netmap1
, netmap2
and a vector network
as private, is that making any performance degradation?â Digonto Islam
Jul 17 at 9:54
That really helped to get a performance boost. But is there any other ways to make it more efficient, like I am using
maxclus
,delmx
and entropycalc
as the shared arrays and ptr
, random
, netmap1
, netmap2
and a vector network
as private, is that making any performance degradation?â Digonto Islam
Jul 17 at 9:54
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f198479%2fba-network-generator-and-percolator-with-openmp-and-also-linear-code%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
3
With a nearly 500 lines long
main()
function this code is hard to reason about. Maybe split it into multiple function? I've been staring at it for like 15 minutes, and still can't get a gist of what is actually being done. Doing so might even help the optimizer, as it then has more knowledge about the lifetime and scope of variables.â hoffmale
Jul 14 at 14:36