parseURL( $url );
		$ret = $this->fetchBlob( $cluster, $id, $itemID );

		if ( $itemID !== false && $ret !== false ) {
			return $ret->getItem( $itemID );
		}

		return $ret;
	}

	/**
	 * Fetch data from given external store URLs.
	 * The provided URLs are in the form of DB://cluster/id
	 * or DB://cluster/id/itemid for concatenated storage.
	 *
	 * @param array $urls An array of external store URLs
	 * @return array A map from url to stored content. Failed results
	 *     are not represented.
	 */
	public function batchFetchFromURLs( array $urls ) {
		$batched = $inverseUrlMap = [];
		foreach ( $urls as $url ) {
			list( $cluster, $id, $itemID ) = $this->parseURL( $url );
			$batched[$cluster][$id][] = $itemID;
			// A false $itemID is cast to int 0 when used as an array key; this is fine
			// because the === check below uses the $itemID values stored in $batched.
			$inverseUrlMap[$cluster][$id][$itemID] = $url;
		}
		$ret = [];
		foreach ( $batched as $cluster => $batchByCluster ) {
			$res = $this->batchFetchBlobs( $cluster, $batchByCluster );
			/** @var HistoryBlob $blob */
			foreach ( $res as $id => $blob ) {
				foreach ( $batchByCluster[$id] as $itemID ) {
					$url = $inverseUrlMap[$cluster][$id][$itemID];
					if ( $itemID === false ) {
						$ret[$url] = $blob;
					} else {
						$ret[$url] = $blob->getItem( $itemID );
					}
				}
			}
		}

		return $ret;
	}

	public function store( $location, $data ) {
		$dbw = $this->getMaster( $location );
		$dbw->insert(
			$this->getTable( $dbw ),
			[ 'blob_text' => $data ],
			__METHOD__
		);
		$id = $dbw->insertId();
		if ( !$id ) {
			throw new MWException( __METHOD__ .
				': no insert ID' );
		}

		return "DB://$location/$id";
	}

	public function isReadOnly( $location ) {
		return ( $this->getLoadBalancer( $location )->getReadOnlyReason() !== false );
	}

	/**
	 * Get a LoadBalancer for the specified cluster
	 *
	 * @param string $cluster Cluster name
	 * @return ILoadBalancer
	 */
	private function getLoadBalancer( $cluster ) {
		$lbFactory = MediaWikiServices::getInstance()->getDBLoadBalancerFactory();

		return $lbFactory->getExternalLB( $cluster );
	}

	/**
	 * Get a replica DB connection for the specified cluster
	 *
	 * @param string $cluster Cluster name
	 * @return DBConnRef
	 */
	public function getSlave( $cluster ) {
		global $wgDefaultExternalStore;

		$lb = $this->getLoadBalancer( $cluster );
		$domainId = $this->getDomainId( $lb->getServerInfo( $lb->getWriterIndex() ) );

		if ( !in_array( "DB://" . $cluster, (array)$wgDefaultExternalStore ) ) {
			wfDebug( "read only external store\n" );
			$lb->allowLagged( true );
		} else {
			wfDebug( "writable external store\n" );
		}

		$db = $lb->getConnectionRef( DB_REPLICA, [], $domainId );
		$db->clearFlag( DBO_TRX ); // sanity

		return $db;
	}

	/**
	 * Get a master database connection for the specified cluster
	 *
	 * @param string $cluster Cluster name
	 * @return MaintainableDBConnRef
	 */
	public function getMaster( $cluster ) {
		$lb = $this->getLoadBalancer( $cluster );
		$domainId = $this->getDomainId( $lb->getServerInfo( $lb->getWriterIndex() ) );

		$db = $lb->getMaintenanceConnectionRef( DB_MASTER, [], $domainId );
		$db->clearFlag( DBO_TRX ); // sanity

		return $db;
	}

	/**
	 * @param array $server Master DB server configuration array for LoadBalancer
	 * @return string|bool Database domain ID or false
	 */
	private function getDomainId( array $server ) {
		if ( isset( $this->params['wiki'] ) ) {
			return $this->params['wiki']; // explicit domain
		}

		if ( isset( $server['dbname'] ) ) {
			// T200471: for b/c, treat any "dbname" field as forcing which database to use.
			// MediaWiki/LoadBalancer previously did not enforce any concept of a local DB
			// domain, but rather assumed that the LB server configuration matched $wgDBname.
			// This check is useful when the external storage DB for this cluster does not use
			// the same name as the corresponding "main" DB(s) for wikis.
			$domain = new DatabaseDomain(
				$server['dbname'],
				$server['schema'] ?? null,
				$server['tablePrefix'] ?? ''
			);

			return $domain->getId();
		}

		return false; // local LB domain
	}

	/**
	 * Get the 'blobs' table name for this database
	 *
	 * @param IDatabase $db
	 * @return string Table name ('blobs' by default)
	 */
	public function getTable( $db ) {
		$table = $db->getLBInfo( 'blobs table' );
		if ( is_null( $table ) ) {
			$table = 'blobs';
		}

		return $table;
	}

	/**
	 * Fetch a blob item out of the database; a cache of the last-loaded
	 * blob will be kept so that multiple loads out of a multi-item blob
	 * can avoid redundant database access and decompression.
	 * @param string $cluster
	 * @param string $id
	 * @param string|bool $itemID Item ID, or false for a single-item blob
	 * @return HistoryBlob|string|bool Raw blob text when $itemID is false,
	 *     otherwise the unserialized blob. Returns false if missing
	 */
	private function fetchBlob( $cluster, $id, $itemID ) {
		/**
		 * One-step cache variable to hold base blobs; operations that
		 * pull multiple revisions may often pull multiple times from
		 * the same blob. By keeping the last-used one open, we avoid
		 * redundant unserialization and decompression overhead.
		 */
		static $externalBlobCache = [];

		$cacheID = ( $itemID === false ) ? "$cluster/$id" : "$cluster/$id/";
		if ( isset( $externalBlobCache[$cacheID] ) ) {
			wfDebugLog( 'ExternalStoreDB-cache',
				"ExternalStoreDB::fetchBlob cache hit on $cacheID" );

			return $externalBlobCache[$cacheID];
		}

		wfDebugLog( 'ExternalStoreDB-cache',
			"ExternalStoreDB::fetchBlob cache miss on $cacheID" );

		$dbr = $this->getSlave( $cluster );
		$ret = $dbr->selectField( $this->getTable( $dbr ),
			'blob_text', [ 'blob_id' => $id ], __METHOD__ );
		if ( $ret === false ) {
			wfDebugLog( 'ExternalStoreDB',
				"ExternalStoreDB::fetchBlob master fallback on $cacheID" );
			// Try the master
			$dbw = $this->getMaster( $cluster );
			$ret = $dbw->selectField( $this->getTable( $dbw ),
				'blob_text', [ 'blob_id' => $id ], __METHOD__ );
			if ( $ret === false ) {
				wfDebugLog( 'ExternalStoreDB',
					"ExternalStoreDB::fetchBlob master failed to find $cacheID" );
			}
		}
		if ( $itemID !== false && $ret !== false ) {
			// Unserialise object; caller extracts item
			$ret = unserialize( $ret );
		}

		$externalBlobCache = [ $cacheID => $ret ];

		return $ret;
	}

	/**
	 * Fetch multiple blob items out of the database
	 *
	 * @param string $cluster A cluster name valid for use with LBFactory
	 * @param array $ids A map from the blob_ids to look for to the requested itemIDs in the blobs
	 * @return array A map from the requested blob_ids to their content.
	 *     IDs that could not be located are not represented
	 */
	private function batchFetchBlobs( $cluster, array $ids ) {
		$dbr = $this->getSlave( $cluster );
		$res = $dbr->select( $this->getTable( $dbr ),
			[ 'blob_id', 'blob_text' ],
			[ 'blob_id' => array_keys( $ids ) ],
			__METHOD__ );
		$ret = [];
		if ( $res !== false ) {
			$this->mergeBatchResult( $ret, $ids, $res );
		}
		if ( $ids ) {
			wfDebugLog( __CLASS__, __METHOD__ .
				" master fallback on '$cluster' for: " .
implode( ',', array_keys( $ids ) ) ); // Try the master $dbw = $this->getMaster( $cluster ); $res = $dbw->select( $this->getTable( $dbr ), [ 'blob_id', 'blob_text' ], [ 'blob_id' => array_keys( $ids ) ], __METHOD__ ); if ( $res === false ) { wfDebugLog( __CLASS__, __METHOD__ . " master failed on '$cluster'" ); } else { $this->mergeBatchResult( $ret, $ids, $res ); } } if ( $ids ) { wfDebugLog( __CLASS__, __METHOD__ . " master on '$cluster' failed locating items: " . implode( ',', array_keys( $ids ) ) ); } return $ret; } /** * Helper function for self::batchFetchBlobs for merging master/replica DB results * @param array &$ret Current self::batchFetchBlobs return value * @param array &$ids Map from blob_id to requested itemIDs * @param mixed $res DB result from Database::select */ private function mergeBatchResult( array &$ret, array &$ids, $res ) { foreach ( $res as $row ) { $id = $row->blob_id; $itemIDs = $ids[$id]; unset( $ids[$id] ); // to track if everything is found if ( count( $itemIDs ) === 1 && reset( $itemIDs ) === false ) { // single result stored per blob $ret[$id] = $row->blob_text; } else { // multi result stored per blob $ret[$id] = unserialize( $row->blob_text ); } } } /** * @param string $url * @return array */ protected function parseURL( $url ) { $path = explode( '/', $url ); return [ $path[2], // cluster $path[3], // id isset( $path[4] ) ? $path[4] : false // itemID ]; } }