Page MenuHomePhabricator

Move Spitfire storage definitions to explicit file
Closed, ResolvedPublic

Description

Currently there's no specific spot to place storage definitions. Spitfire has a functions file, buried deep within the framework itself and not really intended as a user modifiable space. On the other hand, the environments file provides a user configurable location for settings, but a bad one for storage definitions, which are a bit more technical - and would be more fit to developers or system operators rather than users deploying an application.

Usually an application has the following predefined definitions inside of Spitfire:

$dispatcher = new \spitfire\storage\objectStorage\DriveDispatcher();
$dispatcher->register(new \spitfire\storage\drive\MountPoint('file://', '/'));
$dispatcher->register(new \spitfire\storage\drive\MountPoint('app://', basedir()));
$dispatcher->register(new \spitfire\storage\drive\MountPoint('temp://', sys_get_temp_dir()));

These allow an application to access a file like this:

storage()->get('app://bin/settings/environments.php');

And in combination with the Environment class something like:

storage()->dir(Environment::get('uploads.directory'))->get('upload.txt');

This is fine, but the way these are registered leaves a bit to be desired.

The current implementation distributes the file system definitions across two locations. On one hand, the spitfire\core\functions.php file. On the other, the bin/settings/environments.php file for user defined storage definitions. It would be much easier to manage the entire system if there was a single bin/settings/storage.php file that contains all the definitions.

Placing all storage definitions in the environments.php file would be possible, and would result in less clutter on the file system. A single file that contains all the system's configuration sounds promising, but could also lead to users being confused about the options. A storage.php file could also almost create a dialogue between the system operator and the developer. The developer can define an unlimited number of storage endpoints in their application code that would then be assigned to physical locations by the system administrator when deploying the application.

Option A: An additional storage definition file

A storage.php would have the following pros:

  • Setting segmentation. The user would immediately know that storage definitions are supposed to be found in this file.
  • Noob deterrent: If a user feels like changing application settings, they head to environments.php. They would not immediately venture into storage.php

But it also has a series of cons:

  • File system clutter. This file needs to be managed separately by CVS, Configuration deployment scripts, etc. Which adds additional workload
  • Potential confusion. While it should be mandatory to write config into this directory, PHP won't enforce it. Making it hard to actually ensure the code is where it's supposed to be.

Option B: Moving all the settings to environments

This is another very promising option, which assumes that the environments file is only ever managed by system operators anyway. It assumes that only settings required to make the application boot are supposed to be included in these files (storage, database, network, authentication, etc). It has the following pros:

  • With a single environments file, you only have one file to be excluded from git, only one file to push via configuration management, and the option to remove the file entirely and push the configuration to the system's environment.
  • A single location to put user config into.
  • Environments are built so they can deploy across several servers and load conditional configuration. The config on the live server could be different from the one on the stage server and that could be reflected. But this is rather minor pro, since I think that multiple environments on a single system are starting to get deprecated.

But it also has a series of cons:

  • Users may consider static application settings to be user editable. The application may require the temp directory to be somewhere specific, and would have a hard time dealing with the fact that a user changed it. Although this is bad application design, it's sufficiently common.
  • The environments.php file may get cluttered.

I think I need your input @sebastian , thoughts?

Revisions and Commits

rSF Spitfire Engine
Restricted Differential Revision

Event Timeline

cesar triaged this task as Normal priority.Oct 28 2019, 10:53 AM
cesar created this task.
cesar created this object with visibility "Public (No Login Required)".

That's actually a hard decision. In my opinion, I would go with option B.
Sure, both options have their pros and cons but with option B it's maybe easier for developing and configuration, as you mentioned in the first point of your pros.

What about the idea of option B with the user-editable variables on the top and the developer-editable ones down below. Separated by a big block of comment lines. Like:

**User editable settings**

/*
 * End of user-editable settings!
 * 
 * These upcoming lines are only settings that should be edited by the application developer.
 * Be careful by editing these settings!
 */

**Developer editable settings**
cesar added a comment.Oct 31 2019, 5:01 PM

If experience tells us anything about humans, it's that they don't care for stuff like that. Any bigger or more complicated app will probably start to mix the variables up sooner or later, some variables may also not be as straightforward to decide. For example, something like a Google Analytics ID is probably a really tough call to make - who sets that? The user, the admin, devops?

I think it would be good to construct a system that makes it convenient for application developers to make settings that the user is supposed to change available from the user interface, instead of burying them in a system settings file. Maybe something like an abstract controller they can inherit from and quickly define permissions and editable settings so a user can modify them from the UI easily.

Maybe that's even something to consider integrating into Fabric. Maybe we can discuss this further on monday

cesar added a comment.Nov 11 2019, 1:37 PM

Spitfire applications from now on would need the following code in their environments.php file to work like they have until now:

storage()->register(new \spitfire\storage\drive\MountPoint('file://', '/'));
storage()->register(new \spitfire\storage\drive\MountPoint('app://', basedir()));
storage()->register(new \spitfire\storage\drive\MountPoint('temp://', sys_get_temp_dir()));

My recommendation is to include these into the environments.sample.php file that ships with spitfire, so people using the framework and configuring their application by duplicating the sample environments file that ships with the repository will have a good base setup for their needs. In turn, people who do not wish to expose these endpoints, or modify them reliably, can do so without explicit overrides.

cesar added a revision: Restricted Differential Revision.Nov 11 2019, 1:41 PM
cesar added a comment.Dec 10 2019, 8:19 PM

I am planning on revamping the entire storage system in Spitfire. I'm not happy with it for a number of reasons:

  • It supports folders. This is a major inconvenience when writing code, it makes the behavior much more verbose, and seems to provide little advantage to applications that just wish to have a place to dump files to. Folders mean that applications taking advantage of the storage system need to constantly check whether they are operating against a folder, making the code to write a simple file really verbose in comparison.
  • It's really verbose. When you compare it with other blob storage systems, it produces really bloated code.
  • Storage configuration cannot be easily made from outside the system. I'd love to be able to have a central server that is capable of orchestrating the behavior of others on the same network, but right now, the unwieldy configuration makes it hard to place it anywhere.

The idea is to do the following:

  • Introduce a DSN for driver configuration. With a single URL, the application can receive all the parameters needed to configure the application.
  • Make the configuration so lean that it can be placed inside the environments. Making the configuration of spitfire apps almost trivial.
  • Remove folder support from the object storage. This would make it a driver setting whether paths with / are allowed. It should be allowed for applications that need it and legacy code, but it should be discouraged.

The code in future revisions should look similar to the old one, but become a bit leaner and more obvious. Something like this would be the target:

#Reading and writing a file
storage()->disk('uploads')->file('test.txt')->put('Hello world');
storage()->retrieve('uploads://test.txt')->read(); //Should return Hello world

#Buffered writer
$out = storage()->disk('cache')->file(time() . '.cache')->buffer()->writer();

request('https://example.com')
    ->stream(function ($payload) use ($out) { 
         return $out->write($payload); 
     })->send();

#With PHP 7.4, this should be writable as - which makes it pretty sexy IMO
request('https://example.com')->stream(fn($payload) => $out->write($payload))->send();
cesar added a parent task: Unknown Object (Maniphest Task).Dec 16 2019, 9:51 AM

Our environment configuration for the individual storage systems should look a little like this:

$e = new \spitfire\core\Environment();

$e->set('storage.engines', ['uploads', 'thumbs', 'sessions']);

# A simple example using files to store the data
$e->set('storage.engines.uploads', ['\spitfire\storage\drive\Mount', 'file://./bin/usr/uploads']);

# An example using something like Cloudy to store the data
$e->set('SSO', 'https://appid:appsecret@accounts.myapp.com/');
$e->set('storage.engines.thumbs', ['\cloudy\SFMount', 'https://appid@cloudynas.net?sso=@SSO&bucket=uuid']);


#An example using something like TMPFS or Memcached, which are not persistent, but are sufficient for something like sessions
# The second parameter here would be the prefix used
$e->set('storage.engines.session', ['\spitfire\storage\memcached\Mount', 'session_']);

I'd like to consider the option of removing support for arrays inside environments and instead use JSON or by using a syntax like classname:DSN (like `\spitfire\storage\drive\Mount:file:///') for this cause. This way it'd be much simpler to broadcast the changes from a centralized hub with fabric.

In this case, the name of the mount point is provided by the fact that it's in the key. What we need though is an Environment::list method that allows using these keys we have in the example and generates something like:

{
  uploads: ["\spitfire\storage\drive\Mount", "file://./bin/usr/uploads"],
  thumbs: ["\cloudy\SFMount", "https://appid@cloudynas.net?sso=@SSO&bucket=uuid"],
  ...
}
cesar added a comment.Jan 6 2020, 12:14 PM

In order for the new storage system to work, the Driver for every blob storage needs to provide the following functions:

  • read: retrieves a blob from the collection
  • write: Writes the data provided to the storage and returns the path the storage driver can use to retrieve the blob.
  • contains / exists: returns whether the collection contains a blob for the index
  • mime: returns the mime-type of the blob
  • mtime: returns the time the blob was last modified
  • atime: returns the time the blob was last accessed
  • delete: removes the blob from the collection
  • url (ttl): returns the public URL of the object
  • stream: Returns a IOStream Object , throws IOException

The IOStream object just has two methods:

  • reader: returns an object that allows scanning the content of the blob
  • writer: returns an object providing interfaces to write to the stream

Just as a renewed clarification. The blob system is not technically a file-system. I do not intend it to be that. It's intended to provide an abstraction for systems that do require object storage as opposed to file storage. In my experience, this is most of the dealings of web-applications. This is true for:

  • File uploads
  • Session
  • Caches

The application itself does not wish to deal with the specifics of where the file, session, cache information is stored. The application wishes to place a piece of data somewhere so it can be retrieved later. This is what we intend to provide.