Ruby Connected Components in a Graph

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
2
down vote

favorite












I've written the following code to find the connected components of a graph in ruby, but it is quite slow since I don't use any sensible data structure for the disjoint sets, and hence find_set is probably $O(n^2)$ (don't know the complexity for enumerable.find in Ruby).



Can anybody help me improve it? I am relatively new to ruby, and so would appreciate any insights. This code is meant to solve the problem posed Here:



My solution, in plain English, is the find the number of connected components in the graph given, then either calculate the cost of building a library in every node or the cost of building a library in each connected component plus the cost of building the roads for the minimum spanning tree of each component.



The Code



#!/bin/ruby

require 'set'

def roadsAndLibraries(n, c_lib, c_road, cities)
vertices = make_set((1..n).to_a)

#find connected components
cities.each do |a, b|
first_set = find_set(vertices, a)
second_set = find_set(vertices, b)
if first_set != second_set
vertices.add(first_set | second_set)
vertices.delete(first_set)
vertices.delete(second_set)
end
end

#calculate the cost of building the edges required for a MST of each component
roads_cost = c_road * vertices.to_a.map x.size - 1.inject(:+)

if c_road > c_lib
return n * c_lib
else
return vertices.size * c_lib + roads_cost
end

end

#takes an array of integers, and maps them into a set of singleton sets containing each node number
def make_set(arr_n)
vertices = Set.new
for i in 1..arr_n.size
vertices.add([i])
end
vertices.map!
end

#finds set containing a given node value
def find_set(set, node)
set.find x.include?(node)
end





# q is number of queries, n is the number of vertices in the graph
q = gets.strip.to_i
for a0 in (0..q-1)
n, m, c_lib, c_road = gets.strip.split(' ')
n = n.to_i
m = m.to_i
c_lib = c_lib.to_i
c_road = c_road.to_i
cities = Array.new(m)
for cities_i in (0..m-1)
cities_t = gets.strip
cities[cities_i] = cities_t.split(' ').map(&:to_i)
end
result = roadsAndLibraries(n, c_lib, c_road, cities)
puts result
end






share|improve this question



























    up vote
    2
    down vote

    favorite












    I've written the following code to find the connected components of a graph in ruby, but it is quite slow since I don't use any sensible data structure for the disjoint sets, and hence find_set is probably $O(n^2)$ (don't know the complexity for enumerable.find in Ruby).



    Can anybody help me improve it? I am relatively new to ruby, and so would appreciate any insights. This code is meant to solve the problem posed Here:



    My solution, in plain English, is the find the number of connected components in the graph given, then either calculate the cost of building a library in every node or the cost of building a library in each connected component plus the cost of building the roads for the minimum spanning tree of each component.



    The Code



    #!/bin/ruby

    require 'set'

    def roadsAndLibraries(n, c_lib, c_road, cities)
    vertices = make_set((1..n).to_a)

    #find connected components
    cities.each do |a, b|
    first_set = find_set(vertices, a)
    second_set = find_set(vertices, b)
    if first_set != second_set
    vertices.add(first_set | second_set)
    vertices.delete(first_set)
    vertices.delete(second_set)
    end
    end

    #calculate the cost of building the edges required for a MST of each component
    roads_cost = c_road * vertices.to_a.map x.size - 1.inject(:+)

    if c_road > c_lib
    return n * c_lib
    else
    return vertices.size * c_lib + roads_cost
    end

    end

    #takes an array of integers, and maps them into a set of singleton sets containing each node number
    def make_set(arr_n)
    vertices = Set.new
    for i in 1..arr_n.size
    vertices.add([i])
    end
    vertices.map!
    end

    #finds set containing a given node value
    def find_set(set, node)
    set.find x.include?(node)
    end





    # q is number of queries, n is the number of vertices in the graph
    q = gets.strip.to_i
    for a0 in (0..q-1)
    n, m, c_lib, c_road = gets.strip.split(' ')
    n = n.to_i
    m = m.to_i
    c_lib = c_lib.to_i
    c_road = c_road.to_i
    cities = Array.new(m)
    for cities_i in (0..m-1)
    cities_t = gets.strip
    cities[cities_i] = cities_t.split(' ').map(&:to_i)
    end
    result = roadsAndLibraries(n, c_lib, c_road, cities)
    puts result
    end






    share|improve this question























      up vote
      2
      down vote

      favorite









      up vote
      2
      down vote

      favorite











      I've written the following code to find the connected components of a graph in ruby, but it is quite slow since I don't use any sensible data structure for the disjoint sets, and hence find_set is probably $O(n^2)$ (don't know the complexity for enumerable.find in Ruby).



      Can anybody help me improve it? I am relatively new to ruby, and so would appreciate any insights. This code is meant to solve the problem posed Here:



      My solution, in plain English, is the find the number of connected components in the graph given, then either calculate the cost of building a library in every node or the cost of building a library in each connected component plus the cost of building the roads for the minimum spanning tree of each component.



      The Code



      #!/bin/ruby

      require 'set'

      def roadsAndLibraries(n, c_lib, c_road, cities)
      vertices = make_set((1..n).to_a)

      #find connected components
      cities.each do |a, b|
      first_set = find_set(vertices, a)
      second_set = find_set(vertices, b)
      if first_set != second_set
      vertices.add(first_set | second_set)
      vertices.delete(first_set)
      vertices.delete(second_set)
      end
      end

      #calculate the cost of building the edges required for a MST of each component
      roads_cost = c_road * vertices.to_a.map x.size - 1.inject(:+)

      if c_road > c_lib
      return n * c_lib
      else
      return vertices.size * c_lib + roads_cost
      end

      end

      #takes an array of integers, and maps them into a set of singleton sets containing each node number
      def make_set(arr_n)
      vertices = Set.new
      for i in 1..arr_n.size
      vertices.add([i])
      end
      vertices.map!
      end

      #finds set containing a given node value
      def find_set(set, node)
      set.find x.include?(node)
      end





      # q is number of queries, n is the number of vertices in the graph
      q = gets.strip.to_i
      for a0 in (0..q-1)
      n, m, c_lib, c_road = gets.strip.split(' ')
      n = n.to_i
      m = m.to_i
      c_lib = c_lib.to_i
      c_road = c_road.to_i
      cities = Array.new(m)
      for cities_i in (0..m-1)
      cities_t = gets.strip
      cities[cities_i] = cities_t.split(' ').map(&:to_i)
      end
      result = roadsAndLibraries(n, c_lib, c_road, cities)
      puts result
      end






      share|improve this question













      I've written the following code to find the connected components of a graph in ruby, but it is quite slow since I don't use any sensible data structure for the disjoint sets, and hence find_set is probably $O(n^2)$ (don't know the complexity for enumerable.find in Ruby).



      Can anybody help me improve it? I am relatively new to ruby, and so would appreciate any insights. This code is meant to solve the problem posed Here:



      My solution, in plain English, is the find the number of connected components in the graph given, then either calculate the cost of building a library in every node or the cost of building a library in each connected component plus the cost of building the roads for the minimum spanning tree of each component.



      The Code



      #!/bin/ruby

      require 'set'

      def roadsAndLibraries(n, c_lib, c_road, cities)
      vertices = make_set((1..n).to_a)

      #find connected components
      cities.each do |a, b|
      first_set = find_set(vertices, a)
      second_set = find_set(vertices, b)
      if first_set != second_set
      vertices.add(first_set | second_set)
      vertices.delete(first_set)
      vertices.delete(second_set)
      end
      end

      #calculate the cost of building the edges required for a MST of each component
      roads_cost = c_road * vertices.to_a.map x.size - 1.inject(:+)

      if c_road > c_lib
      return n * c_lib
      else
      return vertices.size * c_lib + roads_cost
      end

      end

      #takes an array of integers, and maps them into a set of singleton sets containing each node number
      def make_set(arr_n)
      vertices = Set.new
      for i in 1..arr_n.size
      vertices.add([i])
      end
      vertices.map!
      end

      #finds set containing a given node value
      def find_set(set, node)
      set.find x.include?(node)
      end





      # q is number of queries, n is the number of vertices in the graph
      q = gets.strip.to_i
      for a0 in (0..q-1)
      n, m, c_lib, c_road = gets.strip.split(' ')
      n = n.to_i
      m = m.to_i
      c_lib = c_lib.to_i
      c_road = c_road.to_i
      cities = Array.new(m)
      for cities_i in (0..m-1)
      cities_t = gets.strip
      cities[cities_i] = cities_t.split(' ').map(&:to_i)
      end
      result = roadsAndLibraries(n, c_lib, c_road, cities)
      puts result
      end








      share|improve this question












      share|improve this question




      share|improve this question








      edited Jan 11 at 9:25









      Billal BEGUERADJ

      1




      1









      asked Jan 8 at 11:42









      Darcy Harding

      135




      135




















          1 Answer
          1






          active

          oldest

          votes

















          up vote
          3
          down vote



          accepted










          It Your solution being slow is not too much a matter of language here.
          I will try highlighting ruby related things you could improve to make it more readable on one hand,
          and algorithmic flaws on the other hand.



          Ruby



          Here I will point out some of the ruby code that could be improved. Note that this will not help
          your solution to be much faster since it is mostly an algorithmic problem you have here.



          Find Set



          Your find_set method seems fine. It is a O(n) method. You have to go through every set to find
          the one which includes the node you want. The #include? method of a set is constant in time, and
          that is a good use of set here.



          Make Set



          Your make_set method could be made shorter and cleaner. For instance:



          def make_set(n)
          Set.new((1..n).map i)
          end


          Map + Inject



          I would not recommand using #map and then #inject as it will create an intermediate array, and
          because you can have all in the #inject, like:



          vertices.inject(0) 


          And by the way, you can get rid of to_a since sets accept the #inject and #map methods too.



          Method name



          You should use snake case: roadsAndLibraries becomes roads_and_libraries.



          Algorithmic enhancements



          Unneeded computation



          Itseems to me you don't need any computation if the road is more expensive than the library.
          You can move this to the top of your roadsAndLibraries method.



          return n * c_lib if c_road > c_lib


          Sets lookup



          The main issue with your solution here is probably the lookup you do for each edge. You parse the
          whole set of vertices for each city. This is expensive. You could try having a constant lookup
          here that would help getting the connected component of a vertice on O(1).



          You can use 2 arrays here. One for storing the connected components, like you already do. But
          instead of storing them in a set, you can store them in an array, so they have an associated index.
          Let's call it components.



          And then the second array can be the lookup, we will call it component_index. For each node, you
          store the index of its connected component. That way, all find_set has to do is to take the
          stored index: components[component_index[a]] for city a.



          Your main loop could look like:



          cities.each do |a, b|
          first_set = component_index[a]
          second_set = component_index[b]
          next if first_set == second_set
          merge_components(component_index, components, first_set, second_set)
          end


          Merging components



          Where the algorithm will spend time is when we merge components together. Here you will have to
          update component indexes. What you have to see here is that it will be performed much less often
          than the lookup, especially in case of strongly connected components, where first_set and
          second_set of the main loop will be equal.



          To merge 2 components, I would put the nodes of the smallest component into the largest one, to
          avoid updating too many indexes. And then I would update the indexed component so that it contains
          all nodes.



          def merge_components(component_index, components, first_set, second_set)
          to = first_set
          from = second_set
          to, from = from, to if components[to].size < components[from].size

          components[from].each do |i|
          component_index[i] = to
          end
          components[to].merge components[from]
          end


          In case memory is a problem, you probably don't need to store the sets themselves, but only their
          size, since you know that they will never overlap, and that the size of the merged set is always
          the sum of the sizes of the 2 sets. But by doing so you have to loop through all nodes when
          updating components, which can be much longer.



          Building Cost



          Last thing to change: how we compute the cost of the solution. We will need the final connected
          components. This can be extracted like:



          final_component_index = component_index.uniq


          Then the MST cost would be



           roads_cost = c_road * final_component_index.inject(0) 


          And the final cost



           final_component_index.size * c_lib + roads_cost





          share|improve this answer





















            Your Answer




            StackExchange.ifUsing("editor", function ()
            return StackExchange.using("mathjaxEditing", function ()
            StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
            StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
            );
            );
            , "mathjax-editing");

            StackExchange.ifUsing("editor", function ()
            StackExchange.using("externalEditor", function ()
            StackExchange.using("snippets", function ()
            StackExchange.snippets.init();
            );
            );
            , "code-snippets");

            StackExchange.ready(function()
            var channelOptions =
            tags: "".split(" "),
            id: "196"
            ;
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function()
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled)
            StackExchange.using("snippets", function()
            createEditor();
            );

            else
            createEditor();

            );

            function createEditor()
            StackExchange.prepareEditor(
            heartbeatType: 'answer',
            convertImagesToLinks: false,
            noModals: false,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            );



            );








             

            draft saved


            draft discarded


















            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f184576%2fruby-connected-components-in-a-graph%23new-answer', 'question_page');

            );

            Post as a guest






























            1 Answer
            1






            active

            oldest

            votes








            1 Answer
            1






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            3
            down vote



            accepted










            It Your solution being slow is not too much a matter of language here.
            I will try highlighting ruby related things you could improve to make it more readable on one hand,
            and algorithmic flaws on the other hand.



            Ruby



            Here I will point out some of the ruby code that could be improved. Note that this will not help
            your solution to be much faster since it is mostly an algorithmic problem you have here.



            Find Set



            Your find_set method seems fine. It is a O(n) method. You have to go through every set to find
            the one which includes the node you want. The #include? method of a set is constant in time, and
            that is a good use of set here.



            Make Set



            Your make_set method could be made shorter and cleaner. For instance:



            def make_set(n)
            Set.new((1..n).map i)
            end


            Map + Inject



            I would not recommand using #map and then #inject as it will create an intermediate array, and
            because you can have all in the #inject, like:



            vertices.inject(0) 


            And by the way, you can get rid of to_a since sets accept the #inject and #map methods too.



            Method name



            You should use snake case: roadsAndLibraries becomes roads_and_libraries.



            Algorithmic enhancements



            Unneeded computation



            Itseems to me you don't need any computation if the road is more expensive than the library.
            You can move this to the top of your roadsAndLibraries method.



            return n * c_lib if c_road > c_lib


            Sets lookup



            The main issue with your solution here is probably the lookup you do for each edge. You parse the
            whole set of vertices for each city. This is expensive. You could try having a constant lookup
            here that would help getting the connected component of a vertice on O(1).



            You can use 2 arrays here. One for storing the connected components, like you already do. But
            instead of storing them in a set, you can store them in an array, so they have an associated index.
            Let's call it components.



            And then the second array can be the lookup, we will call it component_index. For each node, you
            store the index of its connected component. That way, all find_set has to do is to take the
            stored index: components[component_index[a]] for city a.



            Your main loop could look like:



            cities.each do |a, b|
            first_set = component_index[a]
            second_set = component_index[b]
            next if first_set == second_set
            merge_components(component_index, components, first_set, second_set)
            end


            Merging components



            Where the algorithm will spend time is when we merge components together. Here you will have to
            update component indexes. What you have to see here is that it will be performed much less often
            than the lookup, especially in case of strongly connected components, where first_set and
            second_set of the main loop will be equal.



            To merge 2 components, I would put the nodes of the smallest component into the largest one, to
            avoid updating too many indexes. And then I would update the indexed component so that it contains
            all nodes.



            def merge_components(component_index, components, first_set, second_set)
            to = first_set
            from = second_set
            to, from = from, to if components[to].size < components[from].size

            components[from].each do |i|
            component_index[i] = to
            end
            components[to].merge components[from]
            end


            In case memory is a problem, you probably don't need to store the sets themselves, but only their
            size, since you know that they will never overlap, and that the size of the merged set is always
            the sum of the sizes of the 2 sets. But by doing so you have to loop through all nodes when
            updating components, which can be much longer.



            Building Cost



            Last thing to change: how we compute the cost of the solution. We will need the final connected
            components. This can be extracted like:



            final_component_index = component_index.uniq


            Then the MST cost would be



             roads_cost = c_road * final_component_index.inject(0) 


            And the final cost



             final_component_index.size * c_lib + roads_cost





            share|improve this answer

























              up vote
              3
              down vote



              accepted










              It Your solution being slow is not too much a matter of language here.
              I will try highlighting ruby related things you could improve to make it more readable on one hand,
              and algorithmic flaws on the other hand.



              Ruby



              Here I will point out some of the ruby code that could be improved. Note that this will not help
              your solution to be much faster since it is mostly an algorithmic problem you have here.



              Find Set



              Your find_set method seems fine. It is a O(n) method. You have to go through every set to find
              the one which includes the node you want. The #include? method of a set is constant in time, and
              that is a good use of set here.



              Make Set



              Your make_set method could be made shorter and cleaner. For instance:



              def make_set(n)
              Set.new((1..n).map i)
              end


              Map + Inject



              I would not recommand using #map and then #inject as it will create an intermediate array, and
              because you can have all in the #inject, like:



              vertices.inject(0) 


              And by the way, you can get rid of to_a since sets accept the #inject and #map methods too.



              Method name



              You should use snake case: roadsAndLibraries becomes roads_and_libraries.



              Algorithmic enhancements



              Unneeded computation



              Itseems to me you don't need any computation if the road is more expensive than the library.
              You can move this to the top of your roadsAndLibraries method.



              return n * c_lib if c_road > c_lib


              Sets lookup



              The main issue with your solution here is probably the lookup you do for each edge. You parse the
              whole set of vertices for each city. This is expensive. You could try having a constant lookup
              here that would help getting the connected component of a vertice on O(1).



              You can use 2 arrays here. One for storing the connected components, like you already do. But
              instead of storing them in a set, you can store them in an array, so they have an associated index.
              Let's call it components.



              And then the second array can be the lookup, we will call it component_index. For each node, you
              store the index of its connected component. That way, all find_set has to do is to take the
              stored index: components[component_index[a]] for city a.



              Your main loop could look like:



              cities.each do |a, b|
              first_set = component_index[a]
              second_set = component_index[b]
              next if first_set == second_set
              merge_components(component_index, components, first_set, second_set)
              end


              Merging components



              Where the algorithm will spend time is when we merge components together. Here you will have to
              update component indexes. What you have to see here is that it will be performed much less often
              than the lookup, especially in case of strongly connected components, where first_set and
              second_set of the main loop will be equal.



              To merge 2 components, I would put the nodes of the smallest component into the largest one, to
              avoid updating too many indexes. And then I would update the indexed component so that it contains
              all nodes.



              def merge_components(component_index, components, first_set, second_set)
              to = first_set
              from = second_set
              to, from = from, to if components[to].size < components[from].size

              components[from].each do |i|
              component_index[i] = to
              end
              components[to].merge components[from]
              end


              In case memory is a problem, you probably don't need to store the sets themselves, but only their
              size, since you know that they will never overlap, and that the size of the merged set is always
              the sum of the sizes of the 2 sets. But by doing so you have to loop through all nodes when
              updating components, which can be much longer.



              Building Cost



              Last thing to change: how we compute the cost of the solution. We will need the final connected
              components. This can be extracted like:



              final_component_index = component_index.uniq


              Then the MST cost would be



               roads_cost = c_road * final_component_index.inject(0) 


              And the final cost



               final_component_index.size * c_lib + roads_cost





              share|improve this answer























                up vote
                3
                down vote



                accepted







                up vote
                3
                down vote



                accepted






                It Your solution being slow is not too much a matter of language here.
                I will try highlighting ruby related things you could improve to make it more readable on one hand,
                and algorithmic flaws on the other hand.



                Ruby



                Here I will point out some of the ruby code that could be improved. Note that this will not help
                your solution to be much faster since it is mostly an algorithmic problem you have here.



                Find Set



                Your find_set method seems fine. It is a O(n) method. You have to go through every set to find
                the one which includes the node you want. The #include? method of a set is constant in time, and
                that is a good use of set here.



                Make Set



                Your make_set method could be made shorter and cleaner. For instance:



                def make_set(n)
                Set.new((1..n).map i)
                end


                Map + Inject



                I would not recommand using #map and then #inject as it will create an intermediate array, and
                because you can have all in the #inject, like:



                vertices.inject(0) 


                And by the way, you can get rid of to_a since sets accept the #inject and #map methods too.



                Method name



                You should use snake case: roadsAndLibraries becomes roads_and_libraries.



                Algorithmic enhancements



                Unneeded computation



                Itseems to me you don't need any computation if the road is more expensive than the library.
                You can move this to the top of your roadsAndLibraries method.



                return n * c_lib if c_road > c_lib


                Sets lookup



                The main issue with your solution here is probably the lookup you do for each edge. You parse the
                whole set of vertices for each city. This is expensive. You could try having a constant lookup
                here that would help getting the connected component of a vertice on O(1).



                You can use 2 arrays here. One for storing the connected components, like you already do. But
                instead of storing them in a set, you can store them in an array, so they have an associated index.
                Let's call it components.



                And then the second array can be the lookup, we will call it component_index. For each node, you
                store the index of its connected component. That way, all find_set has to do is to take the
                stored index: components[component_index[a]] for city a.



                Your main loop could look like:



                cities.each do |a, b|
                first_set = component_index[a]
                second_set = component_index[b]
                next if first_set == second_set
                merge_components(component_index, components, first_set, second_set)
                end


                Merging components



                Where the algorithm will spend time is when we merge components together. Here you will have to
                update component indexes. What you have to see here is that it will be performed much less often
                than the lookup, especially in case of strongly connected components, where first_set and
                second_set of the main loop will be equal.



                To merge 2 components, I would put the nodes of the smallest component into the largest one, to
                avoid updating too many indexes. And then I would update the indexed component so that it contains
                all nodes.



                def merge_components(component_index, components, first_set, second_set)
                to = first_set
                from = second_set
                to, from = from, to if components[to].size < components[from].size

                components[from].each do |i|
                component_index[i] = to
                end
                components[to].merge components[from]
                end


                In case memory is a problem, you probably don't need to store the sets themselves, but only their
                size, since you know that they will never overlap, and that the size of the merged set is always
                the sum of the sizes of the 2 sets. But by doing so you have to loop through all nodes when
                updating components, which can be much longer.



                Building Cost



                Last thing to change: how we compute the cost of the solution. We will need the final connected
                components. This can be extracted like:



                final_component_index = component_index.uniq


                Then the MST cost would be



                 roads_cost = c_road * final_component_index.inject(0) 


                And the final cost



                 final_component_index.size * c_lib + roads_cost





                share|improve this answer













                It Your solution being slow is not too much a matter of language here.
                I will try highlighting ruby related things you could improve to make it more readable on one hand,
                and algorithmic flaws on the other hand.



                Ruby



                Here I will point out some of the ruby code that could be improved. Note that this will not help
                your solution to be much faster since it is mostly an algorithmic problem you have here.



                Find Set



                Your find_set method seems fine. It is a O(n) method. You have to go through every set to find
                the one which includes the node you want. The #include? method of a set is constant in time, and
                that is a good use of set here.



                Make Set



                Your make_set method could be made shorter and cleaner. For instance:



                def make_set(n)
                Set.new((1..n).map i)
                end


                Map + Inject



                I would not recommand using #map and then #inject as it will create an intermediate array, and
                because you can have all in the #inject, like:



                vertices.inject(0) 


                And by the way, you can get rid of to_a since sets accept the #inject and #map methods too.



                Method name



                You should use snake case: roadsAndLibraries becomes roads_and_libraries.



                Algorithmic enhancements



                Unneeded computation



                Itseems to me you don't need any computation if the road is more expensive than the library.
                You can move this to the top of your roadsAndLibraries method.



                return n * c_lib if c_road > c_lib


                Sets lookup



                The main issue with your solution here is probably the lookup you do for each edge. You parse the
                whole set of vertices for each city. This is expensive. You could try having a constant lookup
                here that would help getting the connected component of a vertice on O(1).



                You can use 2 arrays here. One for storing the connected components, like you already do. But
                instead of storing them in a set, you can store them in an array, so they have an associated index.
                Let's call it components.



                And then the second array can be the lookup, we will call it component_index. For each node, you
                store the index of its connected component. That way, all find_set has to do is to take the
                stored index: components[component_index[a]] for city a.



                Your main loop could look like:



                cities.each do |a, b|
                first_set = component_index[a]
                second_set = component_index[b]
                next if first_set == second_set
                merge_components(component_index, components, first_set, second_set)
                end


                Merging components



                Where the algorithm will spend time is when we merge components together. Here you will have to
                update component indexes. What you have to see here is that it will be performed much less often
                than the lookup, especially in case of strongly connected components, where first_set and
                second_set of the main loop will be equal.



                To merge 2 components, I would put the nodes of the smallest component into the largest one, to
                avoid updating too many indexes. And then I would update the indexed component so that it contains
                all nodes.



                def merge_components(component_index, components, first_set, second_set)
                to = first_set
                from = second_set
                to, from = from, to if components[to].size < components[from].size

                components[from].each do |i|
                component_index[i] = to
                end
                components[to].merge components[from]
                end


                In case memory is a problem, you probably don't need to store the sets themselves, but only their
                size, since you know that they will never overlap, and that the size of the merged set is always
                the sum of the sizes of the 2 sets. But by doing so you have to loop through all nodes when
                updating components, which can be much longer.



                Building Cost



                Last thing to change: how we compute the cost of the solution. We will need the final connected
                components. This can be extracted like:



                final_component_index = component_index.uniq


                Then the MST cost would be



                 roads_cost = c_road * final_component_index.inject(0) 


                And the final cost



                 final_component_index.size * c_lib + roads_cost






                share|improve this answer













                share|improve this answer



                share|improve this answer











                answered Jan 11 at 9:05









                Guillaume

                1064




                1064






















                     

                    draft saved


                    draft discarded


























                     


                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function ()
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f184576%2fruby-connected-components-in-a-graph%23new-answer', 'question_page');

                    );

                    Post as a guest













































































                    Popular posts from this blog

                    Python Lists

                    Aion

                    JavaScript Array Iteration Methods