Measuring the clock rate of an Intel CPU in C++

I created a short program to directly measure the clock rate of an Intel CPU. It runs, sleeps for a duration specified by the user, and then measures the number of clock cycles and the amount of time that elapsed while it was asleep, then divides the elapsed clock cycles by the elapsed time to get the frequency of the CPU.



  • Is the code below up to snuff in terms of style and readability?

  • Is there any reason the strategy used and described below won't produce accurate results?

Any feedback is appreciated! If you wish to test out the code, you can download it with git clone https://github.com/firetotherain/CPUHertz.git



#include <thread>
#include <chrono>
#include <cstdio>

//Uses assembly command to get the current value of the cycle "counter"
uint64_t get_cycles()
{
    unsigned int lo,hi;
    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((uint64_t)hi << 32) | lo;
}

//Defines unit of time to measure seconds
typedef std::chrono::duration<double, std::ratio<1,1>> seconds_t;

//In main(int argc, char** argv) this global variable is initialized to argv[0]
const char* program_name;

//Initialized at the start of the program; gets the current time
auto start_time = std::chrono::high_resolution_clock::now();

//Returns the time since the start of the program, measured in seconds
//Has accuracy identical to that of std::chrono::high_resolution_clock
double age()
{
    return seconds_t(std::chrono::high_resolution_clock::now() - start_time).count();
}

//Prints the program usage
void PrintUsage()
{
    printf("Usage:\n");
    printf("%s [measurment duration]\n", program_name);
}

int main(int argc, char** argv)
{
    using namespace std::chrono_literals;
    program_name = argv[0];
    int sleeptime = 100;
    switch(argc)
    {
        case 1:
            sleeptime = 100;
            break;
        case 2:
            try
            {
                sleeptime = std::stoi(argv[1]);
            }
            catch(...)
            {
                printf("Error: argument was not an integer.");
                PrintUsage();
                return 1;
            }
            break;
        default:
            printf("Error: too many arguments.");
            PrintUsage();
            return 1;
            break;
    }
    uint64_t cycles_start = get_cycles();
    double time_start = age();
    std::this_thread::sleep_for(sleeptime * 1ms);
    uint64_t elapsed_cycles = get_cycles() - cycles_start;
    double elapsed_time = age() - time_start;
    printf("CPU MHz: %.3f\n", elapsed_cycles / elapsed_time / 1000000.0);
}







asked Mar 1 at 0:53 by Antonio Perez (edited Mar 1 at 1:02 by 200_success)




















          1 Answer













          There are a number of things that you might employ to improve your program.



          Accuracy




          Is there any reason the strategy used and described below won't produce accurate results?




          Yes! There are several reasons, each covered in its own item below, followed by more general advice about code style and structure.



          Understand System Management Mode



          The code currently does not account for the possibility of the processor entering System Management Mode (SMM). One simple way of putting the machine into SMM is to have it go into standby. My laptop is configured to go into standby when I close the lid and to come back out when I open it again, so when I did that while running this program, I got this result:




          CPU MHz: 2365092677585.790




          While it would be lovely to actually have a processor that fast, my laptop does not, so this result is incorrect.



          Understand sleep_for



          std::this_thread::sleep_for(delay) sleeps for at least delay, but it may sleep longer, depending on what else the operating system and the underlying hardware are doing at the moment. This means that the results of the program vary from run to run. The results of one thousand runs of the program are plotted below.
          [Figure: plot of the results of one thousand runs of the program]
          While most of the results in this case were around the correct value of 2494.225 MHz, the results vary. An alternative would be to use a hardware-based timer and run the timing code as part of the kernel with interrupts disabled.
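A lighter-weight mitigation than moving into the kernel, and not something the original program does, is to repeat the measurement several times and report the median, so that a few runs stretched by scheduling delays do not dominate the result. The following is only a minimal sketch under that assumption; the helper name median is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the median of a set of samples.
// The vector is taken by value so the caller's data is left untouched.
double median(std::vector<double> samples)
{
    if (samples.empty()) {
        return 0.0;  // assumption: report 0 when there is nothing to summarize
    }
    const std::size_t mid = samples.size() / 2;
    // Partially sort so that samples[mid] holds the value it would have in sorted order.
    std::nth_element(samples.begin(), samples.begin() + mid, samples.end());
    const double upper = samples[mid];
    if (samples.size() % 2 == 1) {
        return upper;
    }
    // Even count: average the two middle values.
    const double lower = *std::max_element(samples.begin(), samples.begin() + mid);
    return (lower + upper) / 2.0;
}
```

A caller could collect, say, a dozen frequency estimates into a std::vector<double> and print median(estimates) instead of a single reading.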



          Understand out-of-order execution and cache effects



          The variability of this program as it currently stands is such that it doesn't matter, but when attempting to do precise timing using the RDTSC instruction, it's important to understand that the processor does out-of-order execution. This means that instructions preceding or following the RDTSC instruction may actually be part of the duration of the event you're trying to measure unless precautions are taken. What is generally used is a serializing instruction such as CPUID to eliminate such effects. Intel has a useful whitepaper that describes this in detail and how to apply it.
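As a rough illustration of fencing the counter read, here is a sketch that uses LFENCE on both sides of RDTSC rather than the CPUID approach mentioned above; Intel's instruction reference describes LFENCE as one way to order RDTSC relative to neighbouring instructions. The function name get_cycles_fenced is hypothetical, the intrinsics come from the GCC/Clang header x86intrin.h, and the non-x86 branch exists only so that the sketch compiles elsewhere.

```cpp
#include <chrono>
#include <cstdint>

#if defined(__x86_64__)
#include <x86intrin.h>  // GCC/Clang header providing __rdtsc and _mm_lfence
#endif

// Hypothetical variant of get_cycles() that fences the counter read.
std::uint64_t get_cycles_fenced()
{
#if defined(__x86_64__)
    _mm_lfence();                         // let earlier instructions finish first
    const std::uint64_t tsc = __rdtsc();  // read the time-stamp counter
    _mm_lfence();                         // keep later instructions from starting early
    return tsc;
#else
    // Assumption: fall back to a monotonic clock tick count on other targets.
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}
```

With sleep times in the millisecond range the extra ordering is unlikely to change the result noticeably; it matters mostly when timing short code sequences.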



          General advice



          What follows is more general advice about the coding style and structure.



          Avoid global variables



          The program_name and start_time variables are global variables. It's generally better to explicitly pass variables your function will need rather than using the vague implicit linkage of a global variable. If they must be globals (which is not the case here), then make them static.
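One possible way to drop the start_time global is to keep the starting point inside a small object that the caller owns. This is only a sketch: the class name Stopwatch is hypothetical, and the use of std::chrono::steady_clock instead of high_resolution_clock is my own substitution, on the assumption that a monotonic clock is preferable for measuring intervals.

```cpp
#include <chrono>

// Hypothetical replacement for the global start_time and age():
// the start point lives in an object that the caller creates and passes around.
class Stopwatch
{
public:
    using clock = std::chrono::steady_clock;  // assumption: monotonic clock

    Stopwatch() : start_(clock::now()) {}

    // Seconds elapsed since construction, as a double.
    double elapsed_seconds() const
    {
        return std::chrono::duration<double>(clock::now() - start_).count();
    }

private:
    clock::time_point start_;
};
```

The timing code could then take a const Stopwatch& parameter, which makes its dependency on the clock visible in the function signature.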



          Be careful with size assumptions



          The code currently has these three lines:



          unsigned int lo,hi;
          __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
          return ((uint64_t)hi << 32) | lo;


          It's not wrong as it is, but I'd probably write that using explicit uint32_t sizes for hi and lo to eliminate the possibility of a compiler with a 16-bit unsigned int.
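A sketch of what that might look like follows. The helper combine_halves is a hypothetical name introduced here so that the bit manipulation can be checked separately from the inline assembly; the assembly itself is unchanged and is only compiled with GCC-compatible compilers on x86.

```cpp
#include <cstdint>

// Combines the two 32-bit halves that RDTSC returns in EDX:EAX.
constexpr std::uint64_t combine_halves(std::uint32_t hi, std::uint32_t lo)
{
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
// Same logic as the original get_cycles(), with explicit 32-bit types.
std::uint64_t get_cycles()
{
    std::uint32_t lo, hi;
    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return combine_halves(hi, lo);
}
#endif
```

Including <cstdint> explicitly also removes the reliance on another header happening to define uint64_t.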



          Isolate code to a function



          Right now, the timing is done in main. I'd probably rewrite that to contain the timing code in a function:



          double approx_CPU_MHz(unsigned sleeptime)
          {
              using namespace std::chrono_literals;
              uint64_t cycles_start = get_cycles();
              double time_start = age();
              std::this_thread::sleep_for(sleeptime * 1ms);
              uint64_t elapsed_cycles = get_cycles() - cycles_start;
              double elapsed_time = age() - time_start;
              return elapsed_cycles / elapsed_time / 1000000.0;
          }



          Think of the user



          The PrintUsage routine looks like this:



          void PrintUsage()
          {
              printf("Usage:\n");
              printf("%s [measurment duration]\n", program_name);
          }



          There are three problems with that from the user's point of view. First, it doesn't tell the user the unit of measure for the measurement duration. Is it seconds? Minutes? Microseconds? The user would have to look at the source code to answer that question. Second, it could reasonably be interpreted to be two arguments, both of which are optional. Putting an underscore instead of a space would fix that. Third, the word "measurement" is misspelled.
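A sketch that addresses all three points might look like the following. The function name usage_text and the argument label measurement_duration_ms are hypothetical; the millisecond unit is taken from the sleeptime * 1ms expression in the program. Returning the text instead of printing it, and taking the program name as a parameter, are additional choices of mine that also remove the need for the program_name global.

```cpp
#include <string>

// Hypothetical rewrite of PrintUsage(): builds the usage text and lets the
// caller decide where to print it. The unit is part of the argument name.
std::string usage_text(const std::string& program_name)
{
    return "Usage:\n" + program_name + " [measurement_duration_ms]\n";
}
```

In main, the error paths could then print usage_text(argv[0]) to the standard error stream.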



          Use iostream



          It's not wrong to use printf but using iostreams is both more C++-like and can actually save some runtime processing. To evaluate printf, the computer has to interpret the format string first, while using << means that the compiler has already evaluated the argument type at compile-time.
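For illustration, the final printf call could be expressed with stream manipulators roughly as follows. The function name print_mhz is hypothetical, and writing to a std::ostream& rather than directly to std::cout is my own choice so that the output can also be captured in a string.

```cpp
#include <iomanip>
#include <iostream>
#include <ostream>

// Writes the result with three digits after the decimal point, like "%.3f".
// Note that std::fixed and std::setprecision stay set on the stream afterwards.
void print_mhz(std::ostream& out, double mhz)
{
    out << "CPU MHz: " << std::fixed << std::setprecision(3) << mhz << '\n';
}
```

Calling print_mhz(std::cout, approx_CPU_MHz(sleeptime)) would then take the place of the printf line.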






          share|improve this answer





















            Your Answer




            StackExchange.ifUsing("editor", function ()
            return StackExchange.using("mathjaxEditing", function ()
            StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
            StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
            );
            );
            , "mathjax-editing");

            StackExchange.ifUsing("editor", function ()
            StackExchange.using("externalEditor", function ()
            StackExchange.using("snippets", function ()
            StackExchange.snippets.init();
            );
            );
            , "code-snippets");

            StackExchange.ready(function()
            var channelOptions =
            tags: "".split(" "),
            id: "196"
            ;
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function()
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled)
            StackExchange.using("snippets", function()
            createEditor();
            );

            else
            createEditor();

            );

            function createEditor()
            StackExchange.prepareEditor(
            heartbeatType: 'answer',
            convertImagesToLinks: false,
            noModals: false,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            );



            );








             

            draft saved


            draft discarded


















            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f188574%2fmeasuring-the-clock-rate-of-an-intel-cpu-in-c%23new-answer', 'question_page');

            );

            Post as a guest






























            1 Answer
            1






            active

            oldest

            votes








            1 Answer
            1






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            11
            down vote













            There are a number of things that you might employ to improve your program.



            Accuracy




            Is there any reason the strategy used and described below won't produce accurate results?




            Yes! There are many reasons, which are each enumerated separately in items below, followed by more general information about code style and structure.



            Understand System Management Mode



            The code currently does not account for the possibilty of the processor entering System Management Mode (SMM). One simple way of putting the machine into SMM is to have it go into standby. My laptop is configured to go into standby when I close the lid and come back out when I open it again, so when I did that when running this program, I got this result:




            CPU MHz: 2365092677585.790




            While it would be lovely to actually have a processor that fast, my laptop does not, so this result is incorrect.



            Understand sleep_for



            What std::this_thread::sleep_for(delay) does is to sleep for at least delay but it may be longer, depending on what else the operating system and the underlying hardware are doing at the moment. This means that there is a variability in the results of the program. An example of one thousand runs of the program is shown plotted below.
            enter image description here
            While in this case, most of the results were around the correct value of 2494.225 MHz, the results vary. The alternate way to do this would be to use a hardware-based timer instead and run the timing code as part of the kernel with interrupts disabled.



            Understand out-of-order execution and cache effects



            The variability of this program as it currently stands is such that it doesn't matter, but when attempting to do precise timing using the RDTSC instruction, it's important to understand that the processor does out-of-order execution. This means that instructions preceding or following the RDTSC instruction may actually be part of the duration of the event you're trying to measure unless precautions are taken. What is generally used is a serializing instruction such as CPUID to eliminate such effects. Intel has a useful whitepaper that describes this in detail and how to apply it.



            General advice



            What follows is more general advice about the coding style and structure.



            Avoid global variables



            The program_name and start_time variables are global variables. It's generally better to explicitly pass variables your function will need rather than using the vague implicit linkage of a global variable. If they must be globals (which is not the case here), then make them static.



            Be careful with size assumptions



            The code currently has these three lines:



            unsigned int lo,hi;
            __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
            return ((uint64_t)hi << 32) | lo;


            It's not wrong as it is, but I'd probably write that using explicit uint32_t sizes for hi and lo to eliminate the possibility of a compiler with a 16-bit unsigned int.



            Isolate code to a function



            Right now, the timing is done in main. I'd probably rewrite that to contain the timing code in a function:



            double approx_CPU_MHz(unsigned sleeptime) 
            using namespace std::chrono_literals;
            uint64_t cycles_start = get_cycles();
            double time_start = age();
            std::this_thread::sleep_for(sleeptime * 1ms);
            uint64_t elapsed_cycles = get_cycles() - cycles_start;
            double elapsed_time = age() - time_start;
            return elapsed_cycles / elapsed_time / 1000000.0;



            Think of the user



            The PrintUsage routine looks like this:



            void PrintUsage() 

            printf("Usage:n");
            printf("%s [measurment duration]n", program_name);



            There are three problems with that from the user's point of view. First, it doesn't tell the user the unit of measure for the measurement duration. Is it seconds? Minutes? Microseconds? The user would have to look at the source code to answer that question. Second, it could reasonably be interpreted to be two arguments, both of which are optional. Putting an underscore instead of a space would fix that. Third, the word "measurement" is misspelled.



            Use iostream



            It's not wrong to use printf but using iostreams is both more C++-like and can actually save some runtime processing. To evaluate printf, the computer has to interpret the format string first, while using << means that the compiler has already evaluated the argument type at compile-time.






            share|improve this answer

























              up vote
              11
              down vote













              There are a number of things that you might employ to improve your program.



              Accuracy




              Is there any reason the strategy used and described below won't produce accurate results?




              Yes! There are many reasons, which are each enumerated separately in items below, followed by more general information about code style and structure.



              Understand System Management Mode



              The code currently does not account for the possibilty of the processor entering System Management Mode (SMM). One simple way of putting the machine into SMM is to have it go into standby. My laptop is configured to go into standby when I close the lid and come back out when I open it again, so when I did that when running this program, I got this result:




              CPU MHz: 2365092677585.790




              While it would be lovely to actually have a processor that fast, my laptop does not, so this result is incorrect.



              Understand sleep_for



              What std::this_thread::sleep_for(delay) does is to sleep for at least delay but it may be longer, depending on what else the operating system and the underlying hardware are doing at the moment. This means that there is a variability in the results of the program. An example of one thousand runs of the program is shown plotted below.
              enter image description here
              While in this case, most of the results were around the correct value of 2494.225 MHz, the results vary. The alternate way to do this would be to use a hardware-based timer instead and run the timing code as part of the kernel with interrupts disabled.



              Understand out-of-order execution and cache effects



              The variability of this program as it currently stands is such that it doesn't matter, but when attempting to do precise timing using the RDTSC instruction, it's important to understand that the processor does out-of-order execution. This means that instructions preceding or following the RDTSC instruction may actually be part of the duration of the event you're trying to measure unless precautions are taken. What is generally used is a serializing instruction such as CPUID to eliminate such effects. Intel has a useful whitepaper that describes this in detail and how to apply it.



              General advice



              What follows is more general advice about the coding style and structure.



              Avoid global variables



              The program_name and start_time variables are global variables. It's generally better to explicitly pass variables your function will need rather than using the vague implicit linkage of a global variable. If they must be globals (which is not the case here), then make them static.



              Be careful with size assumptions



              The code currently has these three lines:



              unsigned int lo,hi;
              __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
              return ((uint64_t)hi << 32) | lo;


              It's not wrong as it is, but I'd probably write that using explicit uint32_t sizes for hi and lo to eliminate the possibility of a compiler with a 16-bit unsigned int.



              Isolate code to a function



              Right now, the timing is done in main. I'd probably rewrite that to contain the timing code in a function:



              double approx_CPU_MHz(unsigned sleeptime) 
              using namespace std::chrono_literals;
              uint64_t cycles_start = get_cycles();
              double time_start = age();
              std::this_thread::sleep_for(sleeptime * 1ms);
              uint64_t elapsed_cycles = get_cycles() - cycles_start;
              double elapsed_time = age() - time_start;
              return elapsed_cycles / elapsed_time / 1000000.0;



              Think of the user



              The PrintUsage routine looks like this:



              void PrintUsage() 

              printf("Usage:n");
              printf("%s [measurment duration]n", program_name);



              There are three problems with that from the user's point of view. First, it doesn't tell the user the unit of measure for the measurement duration. Is it seconds? Minutes? Microseconds? The user would have to look at the source code to answer that question. Second, it could reasonably be interpreted to be two arguments, both of which are optional. Putting an underscore instead of a space would fix that. Third, the word "measurement" is misspelled.



              Use iostream



              It's not wrong to use printf but using iostreams is both more C++-like and can actually save some runtime processing. To evaluate printf, the computer has to interpret the format string first, while using << means that the compiler has already evaluated the argument type at compile-time.






              share|improve this answer























                up vote
                11
                down vote










                up vote
                11
                down vote









                There are a number of things that you might employ to improve your program.



                Accuracy




                Is there any reason the strategy used and described below won't produce accurate results?




                Yes! There are many reasons, which are each enumerated separately in items below, followed by more general information about code style and structure.



                Understand System Management Mode



                The code currently does not account for the possibilty of the processor entering System Management Mode (SMM). One simple way of putting the machine into SMM is to have it go into standby. My laptop is configured to go into standby when I close the lid and come back out when I open it again, so when I did that when running this program, I got this result:




                CPU MHz: 2365092677585.790




                While it would be lovely to actually have a processor that fast, my laptop does not, so this result is incorrect.



                Understand sleep_for



                What std::this_thread::sleep_for(delay) does is to sleep for at least delay but it may be longer, depending on what else the operating system and the underlying hardware are doing at the moment. This means that there is a variability in the results of the program. An example of one thousand runs of the program is shown plotted below.
                enter image description here
                While in this case, most of the results were around the correct value of 2494.225 MHz, the results vary. The alternate way to do this would be to use a hardware-based timer instead and run the timing code as part of the kernel with interrupts disabled.



                Understand out-of-order execution and cache effects



                The variability of this program as it currently stands is such that it doesn't matter, but when attempting to do precise timing using the RDTSC instruction, it's important to understand that the processor does out-of-order execution. This means that instructions preceding or following the RDTSC instruction may actually be part of the duration of the event you're trying to measure unless precautions are taken. What is generally used is a serializing instruction such as CPUID to eliminate such effects. Intel has a useful whitepaper that describes this in detail and how to apply it.



                General advice



                What follows is more general advice about the coding style and structure.



                Avoid global variables



                The program_name and start_time variables are global variables. It's generally better to explicitly pass variables your function will need rather than using the vague implicit linkage of a global variable. If they must be globals (which is not the case here), then make them static.



                Be careful with size assumptions



                The code currently has these three lines:



                unsigned int lo,hi;
                __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
                return ((uint64_t)hi << 32) | lo;


                It's not wrong as it is, but I'd probably write that using explicit uint32_t sizes for hi and lo to eliminate the possibility of a compiler with a 16-bit unsigned int.



                Isolate code to a function



                Right now, the timing is done in main. I'd probably rewrite that to contain the timing code in a function:



                double approx_CPU_MHz(unsigned sleeptime)
                {
                    using namespace std::chrono_literals;
                    uint64_t cycles_start = get_cycles();
                    double time_start = age();
                    std::this_thread::sleep_for(sleeptime * 1ms);
                    uint64_t elapsed_cycles = get_cycles() - cycles_start;
                    double elapsed_time = age() - time_start;
                    return elapsed_cycles / elapsed_time / 1000000.0;
                }



                Think of the user



                The PrintUsage routine looks like this:



                void PrintUsage()
                {
                    printf("Usage:\n");
                    printf("%s [measurment duration]\n", program_name);
                }



                There are three problems with that from the user's point of view. First, it doesn't tell the user the unit of measure for the measurement duration. Is it seconds? Minutes? Microseconds? The user would have to look at the source code to answer that question. Second, it could reasonably be interpreted to be two arguments, both of which are optional. Putting an underscore instead of a space would fix that. Third, the word "measurement" is misspelled.



                Use iostream



                It's not wrong to use printf, but iostreams are both more idiomatic C++ and can save some run-time processing. To evaluate printf, the program has to interpret the format string at run time, while with << the compiler has already resolved the argument types at compile time, which also rules out a mismatch between a format specifier and its argument.
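As a sketch of what the iostream version might look like, the function below prints a frequency with three decimal places. The name print_mhz and the "CPU MHz: " label with a %.3f-style format are assumptions about the original output line.

```cpp
#include <iomanip>
#include <iostream>

// Hypothetical helper: streams the result with three decimals,
// comparable to printf("CPU MHz: %.3f\n", mhz).
void print_mhz(std::ostream& out, double mhz) {
    const auto old_flags = out.flags();          // remember the caller's formatting
    const auto old_precision = out.precision();
    out << "CPU MHz: " << std::fixed << std::setprecision(3) << mhz << '\n';
    out.flags(old_flags);                        // restore it so later output is unaffected
    out.precision(old_precision);
}
```

Restoring the flags matters because std::fixed and std::setprecision are sticky and would otherwise change every later floating-point write to the same stream.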











































                answered Mar 1 at 16:58









                Edward























                     
