Memory management - Large amount of data
Context
I created a bundle to run asynchronous actions (using RabbitMQ) from data extracted by a service named extract_rule in the bundle context.
Here are some definitions from the documentation:
An extract_rule refers to a Symfony service that retrieves an array of data. A task will be created for each item of this array.
An action is a service doing any work you want. It can be triggered by other previous actions in a definable and predictable order (composing what we call a workflow).
A workflow refers to the way actions are linked together. You can use conditions depending on the results of previous actions to trigger one action or another.
A task refers to multiple actions linked together for one extracted item. It is a MongoDB document and can be used to resume actions if they failed.
With all of these, we can compose a task configuration to define how tasks are created and processed.
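As a rough illustration of the first definition, here is a minimal sketch of what an extract rule could look like. The interface name `ExtractRuleInterface`, the `RangeExtractRule` class and its `start`/`end` parameters are hypothetical and not taken from the bundle; only the `extract($parameters)` call shape comes from the handler in this post.

```php
<?php

// Hypothetical contract: the bundle's real interface may differ.
interface ExtractRuleInterface
{
    /**
     * @param array $parameters Options taken from the task configuration.
     *
     * @return array One entry per task to create.
     */
    public function extract(array $parameters): array;
}

// Hypothetical rule returning the integers from "start" to "end".
final class RangeExtractRule implements ExtractRuleInterface
{
    public function extract(array $parameters): array
    {
        return range($parameters['start'], $parameters['end']);
    }
}

$items = (new RangeExtractRule())->extract(['start' => 1, 'end' => 3]);

// One task would be created for each extracted item.
foreach ($items as $item) {
    echo "task for item $item\n";
}
```

Because the rule returns a plain array, every item exists in memory before the first task is created.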
Here is a simple schema that helps to picture how tasks are created and processed. Each arrow can represent a RabbitMQ message that is sent and will be consumed.
The data extraction is the entrypoint of the execution.
I am creating a custom extract rule and I am facing a memory problem.
This is a code snippet of the handler that calls the extract rule when needed:
<?php

namespace IDCI\Bundle\TaskBundle\Handler;

use Symfony\Component\EventDispatcher\EventDispatcherInterface;
use IDCI\Bundle\TaskBundle\Model\AbstractTaskConfiguration;
use IDCI\Bundle\TaskBundle\Event\DataExtractedEvent;
use IDCI\Bundle\TaskBundle\ExtractRule\ExtractRuleRegistry;

class ExtractRuleHandler
{
    /**
     * @var ExtractRuleRegistry
     */
    protected $registry;

    /**
     * @var EventDispatcherInterface
     */
    protected $dispatcher;

    /**
     * Constructor.
     *
     * @param ExtractRuleRegistry      $registry
     * @param EventDispatcherInterface $dispatcher
     */
    public function __construct(
        ExtractRuleRegistry $registry,
        EventDispatcherInterface $dispatcher
    ) {
        $this->registry = $registry;
        $this->dispatcher = $dispatcher;
    }

    /**
     * Execute all extract rules and log for each
     *
     * @param AbstractTaskConfiguration $taskConfiguration
     */
    public function execute(AbstractTaskConfiguration $taskConfiguration)
    {
        $extractRuleConfiguration = json_decode($taskConfiguration->getExtractRule(), true);

        // Extract data
        $extractedData = $this->registry
            ->getRule($extractRuleConfiguration['service'])
            ->extract($extractRuleConfiguration['parameters'])
        ;

        // Dispatch event with extractedData and taskConfiguration
        $this->dispatcher->dispatch(
            DataExtractedEvent::NAME,
            new DataExtractedEvent($taskConfiguration, $extractedData)
        );
    }
}
As you can see above, the extract method is called directly, without any optimization. If someone creates an extract rule that extracts millions of items, it will take a long time and consume a lot of memory. Data can be extracted from any source (API, file, etc.).
I am sure I am missing something in my extract rule management. I want to find a generic way to handle memory and performance properly. I think this concern should be abstracted away from anyone who wants to create a custom extract rule.
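To make the concern concrete, here is a minimal sketch contrasting an eager extraction with a lazy one built on PHP generators, plus a helper that groups rows into batches. All function names (`extractEagerly`, `extractLazily`, `inBatches`) are hypothetical, and whether the bundle's event and its listeners could consume an iterator instead of an array is an assumption the post does not confirm.

```php
<?php

// Eager variant: builds the whole result set before returning it.
function extractEagerly(int $count): array
{
    $rows = [];
    for ($i = 0; $i < $count; $i++) {
        $rows[] = ['id' => $i];
    }

    return $rows;
}

// Lazy variant: yields one row at a time instead of building an array.
function extractLazily(int $count): \Generator
{
    for ($i = 0; $i < $count; $i++) {
        yield ['id' => $i];
    }
}

// Groups any iterable into fixed-size batches; the last one may be smaller.
function inBatches(iterable $rows, int $size): \Generator
{
    $batch = [];
    foreach ($rows as $row) {
        $batch[] = $row;
        if (count($batch) === $size) {
            yield $batch;
            $batch = [];
        }
    }
    if ($batch !== []) {
        yield $batch;
    }
}

$batchSizes = [];
foreach (inBatches(extractLazily(10), 4) as $batch) {
    // A handler could dispatch one event per batch at this point.
    $batchSizes[] = count($batch);
}
```

With this shape, at most one batch of rows is held at a time on the consuming side, whereas `extractEagerly` keeps every row alive until the caller is done with the array.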
Questions
- Are there concepts (such as design patterns) to handle this problem?
- Can someone enlighten me? I am really lost.
I really hope I have been clear,
Thanks a lot for your answers :)
php memory-management
edited Jun 6 at 13:56
asked Jun 6 at 12:32
BwaBwa
14