Commit

[hotfix][3.6.3] Merging back hotfix into finney (#1065)
* Update README.md

* Hotfix/3.6.2/validator logit parameters (#1057)

* additional parameters

* fixed naming to logit divergence

* versioning and fixes

* typo fixes

* bug fixes

* Tests cli fixes (#1058)

* fix btcli list with wallet.path (#1036)

fix path join

* remove mock subtensor and replace with mock calls

* additional fixes

* mock wallet

Co-authored-by: Cameron Fairchild <[email protected]>

* Log prune_len and logits_divergence

* Always get latest prune_len

Co-authored-by: Cameron Fairchild <[email protected]>
Co-authored-by: opentaco <[email protected]>

* fixing no_version_checking error

* updating version to 3.6.3

---------

Co-authored-by: Unconst <[email protected]>
Co-authored-by: Eugene-hu <[email protected]>
Co-authored-by: Cameron Fairchild <[email protected]>
Co-authored-by: opentaco <[email protected]>
Co-authored-by: Eugene <[email protected]>
6 people authored Feb 2, 2023
1 parent c04403c commit ceb29a2
Showing 4 changed files with 46 additions and 21 deletions.
16 changes: 10 additions & 6 deletions README.md
@@ -13,7 +13,13 @@

</div>

At Bittensor, we are creating an open, decentralized, peer-to-peer network that functions as a market system for the development of artificial intelligence. Our purpose is not only to accelerate the development of AI by creating an environment optimally conducive to its evolution, but to democratize the global production and use of this valuable commodity. Our aim is to disrupt the status quo: a system that is centrally controlled, inefficient and unsustainable. In developing the Bittensor API, we are allowing engineers to monetize their work, gain access to machine intelligence and join our community of creative, forward-thinking individuals. For more info, read our [paper](https://drive.google.com/file/d/1VnsobL6lIAAqcA1_Tbm8AYIQscfJV4KU/view).
This repository contains Bittensor's Python API, which can be used to 1) query the Bittensor network as a [client](#31-client), 2) run and build Bittensor miners and validators for [mining TAO](#43-running-a-template-miner), 3) pull network [state information](#3-using-bittensor), and 4) manage [TAO wallets](#41-cli), balances, transfers, etc.

Bittensor is a mining network (like Bitcoin) with built-in incentives designed to drive miners to provide value. In our network, miners provide that value by hosting trained or training machine learning models, which clients can query for inference over inputs (e.g. text generation, or numerical embeddings from a large foundation model like GPT-NeoX-20B).

Token-based incentives are built in by design, both to grow the network and to distribute the value generated by the network directly to the individuals producing it, without intermediaries. The network is open to all who participate, and no individual or group has full control over what it learns, who can profit from it, or who can access it.

To learn more about Bittensor, read our [paper](https://drive.google.com/file/d/1VnsobL6lIAAqcA1_Tbm8AYIQscfJV4KU/view).

- [1. Documentation](#1-documentation)
- [2. Install](#2-install)
@@ -26,11 +32,9 @@ At Bittensor, we are creating an open, decentralized, peer-to-peer network that
- [4.2. Selecting the network to join](#42-selecting-the-network-to-join)
- [4.3. Running a template miner](#43-running-a-template-miner)
- [4.4. Running a template server](#44-running-a-template-server)
- [4.5. Subscription to the network](#45-subscription-to-the-network)
- [4.6. Syncing with the chain/ Finding the ranks/stake/uids of other nodes](#46-syncing-with-the-chain-finding-the-ranksstakeuids-of-other-nodes)
- [4.7. Finding and creating the endpoints for other nodes in the network](#47-finding-and-creating-the-endpoints-for-other-nodes-in-the-network)
- [4.8. Querying others in the network](#48-querying-others-in-the-network)
- [4.9. Creating a Priority Thread Pool for the axon](#49-creating-a-priority-thread-pool-for-the-axon)
- [4.5. Syncing with the chain/ Finding the ranks/stake/uids of other nodes](#46-syncing-with-the-chain-finding-the-ranksstakeuids-of-other-nodes)
- [4.6. Finding and creating the endpoints for other nodes in the network](#47-finding-and-creating-the-endpoints-for-other-nodes-in-the-network)
- [4.7. Querying others in the network](#48-querying-others-in-the-network)
- [5. Release](#5-release)
- [6. License](#6-license)
- [7. Acknowledgments](#7-acknowledgments)
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
3.6.1
3.6.3
39 changes: 25 additions & 14 deletions bittensor/_neuron/text/core_validator/__init__.py
@@ -148,7 +148,7 @@ def __init__(
self.device = torch.device ( device = self.config.neuron.device )
self.nucleus = nucleus ( config = self.config, device = self.device, subtensor = self.subtensor, vlogger = self.vlogger ).to( self.device )
self.dataset = (bittensor.dataset(config=self.config, batch_size=self.subtensor.validator_batch_size(self.config.netuid),
block_size=self.subtensor.validator_sequence_length(self.config.netuid) + self.config.neuron.validation_len + self.config.neuron.prune_len)
block_size=self.subtensor.validator_sequence_length(self.config.netuid) + self.config.neuron.validation_len + self.subtensor.validator_prune_len(netuid=self.config.netuid))
if dataset is None else dataset)
self.optimizer = torch.optim.SGD(
self.nucleus.parameters(), lr=self.config.neuron.learning_rate, momentum=self.config.neuron.momentum
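
The hunk above sizes each dataset sample as the sum of three lengths, with the prune length now read from the chain instead of the local config. A minimal sketch of that arithmetic follows; the helper name `dataset_block_size` and the example values 256 and 1 are hypothetical, and only the default `validation_len` of 8 comes from the diff.

```python
def dataset_block_size(sequence_length: int, validation_len: int, prune_len: int) -> int:
    """Tokens per dataset sample: context, held-out validation tokens, and pruned tokens."""
    if min(sequence_length, validation_len, prune_len) < 0:
        raise ValueError("all lengths must be non-negative")
    return sequence_length + validation_len + prune_len


# Hypothetical on-chain values: sequence length 256 and prune length 1.
print(dataset_block_size(sequence_length=256, validation_len=8, prune_len=1))  # 265
```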
@@ -205,7 +205,7 @@ def add_args( cls, parser ):
parser.add_argument('--neuron.blocks_per_epoch', type=int, help='Blocks per epoch, -1 value means we use the chain value.', default = -1 )
parser.add_argument('--neuron.epochs_until_reset', type=int, help='Number of epochs before weights are reset.', default = -1 )
parser.add_argument('--neuron.validation_len', type=int, help='Number of tokens to holdout for phrase validation beyond sequence context.', default=8)
parser.add_argument('--neuron.prune_len', type=int, help='Number of tokens to prune from each validation input sequence.', default=1)
parser.add_argument('--neuron.prune_len', type=int, help='Number of tokens to prune from each validation input sequence. (default value: -1, pulling from subtensor directly)', default=-1)
parser.add_argument('--neuron.device', type=str, help='miner default training device cpu/cuda', default=("cuda" if torch.cuda.is_available() else "cpu"))
parser.add_argument('--neuron.clip_gradients', type=float, help='Implement gradient clipping to avoid exploding loss on smaller architectures.', default=1.0 )
parser.add_argument('--neuron.track_hotkey_changes', action='store_true', help='If True, track hotkey changes.', default=False)
@@ -375,7 +375,9 @@ def run_epoch( self ):
batch_size = self.subtensor.validator_batch_size(netuid=self.config.netuid)
sequence_length = self.subtensor.validator_sequence_length(netuid=self.config.netuid)
validation_len = self.config.neuron.validation_len # Number of tokens to holdout for phrase validation beyond sequence context
prune_len = self.config.neuron.prune_len # Number of tokens to holdout for phrase validation beyond sequence context
# Number of tokens to prune for phrase validation beyond sequence context
prune_len = self.config.neuron.prune_len = self.subtensor.validator_prune_len(netuid=self.config.netuid)
self.config.nucleus.logits_divergence = self.subtensor.validator_logits_divergence(netuid=self.config.netuid)
min_allowed_weights = self.subtensor.min_allowed_weights(netuid=self.config.netuid)
max_weight_limit = self.subtensor.max_weight_limit(netuid=self.config.netuid)
blocks_per_epoch = self.subtensor.validator_epoch_length(netuid=self.config.netuid) if self.config.neuron.blocks_per_epoch == -1 else self.config.neuron.blocks_per_epoch
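
Several flags in this diff use `-1` as a sentinel meaning "pull the value from subtensor". In the hunk above, `blocks_per_epoch` follows that pattern, while `prune_len` and `logits_divergence` are assigned from the chain unconditionally on every epoch. The sentinel pattern can be sketched as below; `resolve_hyperparameter` and the lambda standing in for a chain query are hypothetical names, not part of the Bittensor API.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def resolve_hyperparameter(cli_value: T, fetch_from_chain: Callable[[], T], sentinel: int = -1) -> T:
    """Return the CLI value unless it equals the sentinel, in which case query the chain."""
    return fetch_from_chain() if cli_value == sentinel else cli_value


# The lambda is a stand-in for a call such as subtensor.validator_epoch_length(netuid=...).
print(resolve_hyperparameter(-1, lambda: 100))  # sentinel left in place: chain value is used
print(resolve_hyperparameter(50, lambda: 100))  # explicit CLI value wins
```

Passing a callable rather than a pre-fetched value means the chain is only queried when the sentinel is actually present.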
@@ -657,7 +659,7 @@ def neuron_stats_update(self, neuron_stats: Dict[int, Dict[str, Any]]):

if 'logits_excess_nxt' in stats:
# penalize by logits divergence excess
extra_stats['shapley_values_nxt'] /= 1 + stats['logits_excess_nxt']
extra_stats['shapley_values_nxt'] /= 1 + self.config.nucleus.logits_divergence * stats['logits_excess_nxt']

# === EMA zeroing update ===
# Push zero into EMA for synapse_keys to exponentially decay weighting keys if neuron non-responsive
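
The change in this hunk scales the logits-excess penalty by the `logits_divergence` hyperparameter before dividing the Shapley value. A minimal sketch of that formula on plain floats follows; the function name `penalize_shapley` and the sample numbers are hypothetical, and the real code operates on entries of the per-neuron stats dictionaries.

```python
def penalize_shapley(shapley_value: float, logits_excess: float, logits_divergence: float) -> float:
    """Divide the Shapley value by 1 + logits_divergence * logits_excess."""
    return shapley_value / (1.0 + logits_divergence * logits_excess)


# A divergence of 1.0 reproduces the pre-change divisor, 1 + logits_excess.
print(penalize_shapley(1.0, logits_excess=3.0, logits_divergence=1.0))  # 0.25
# A smaller divergence softens the penalty; 0.0 disables it.
print(penalize_shapley(1.0, logits_excess=3.0, logits_divergence=0.5))  # 0.4
print(penalize_shapley(1.0, logits_excess=3.0, logits_divergence=0.0))  # 1.0
```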
@@ -750,6 +752,7 @@ def __init__( self, config, device, subtensor, vlogger ):
super(nucleus, self).__init__()
self.config = config
self.vlogger = vlogger
self.config.nucleus.logits_divergence = subtensor.validator_logits_divergence(netuid=self.config.netuid) if self.config.nucleus.logits_divergence == -1 else self.config.nucleus.logits_divergence
self.config.nucleus.scaling_law_power = subtensor.scaling_law_power(netuid=self.config.netuid) if self.config.nucleus.scaling_law_power == -1 else self.config.nucleus.scaling_law_power
self.config.nucleus.synergy_scaling_law_power = subtensor.synergy_scaling_law_power(netuid=self.config.netuid) if self.config.nucleus.synergy_scaling_law_power == -1 else self.config.nucleus.synergy_scaling_law_power

@@ -799,6 +802,7 @@ def add_args( cls, parser ):
parser.add_argument('--nucleus.no_dendrite_backward', action='store_true', help='Pass backward request to the server side or not', default=False )
parser.add_argument('--nucleus.scaling_law_power', type=float, help='Power for modified scaling law, powered down to improve dynamic range, e.g. 3 → 6 nats for 0.5. (default value: -1, pulling from subtensor directly)', default=-1)
parser.add_argument('--nucleus.synergy_scaling_law_power', type=float, help='Power for synergy modified scaling law, powered down to improve dynamic range, e.g. 3 → 6 nats for 0.5. (default value: -1, pulling from subtensor directly)', default=-1)
parser.add_argument('--nucleus.logits_divergence', type=float, help=' the divergence value for logit anomaly detection (default value: -1, pulling from subtensor directly)', default=-1)

@classmethod
def config ( cls ):
@@ -910,7 +914,7 @@ def forward(
num_endpoints = len(random_endpoints) # in case len(self.permute_uids) < num_endpoints during random_uids select

logger.info(f'Forward \t| Routing forward <dim>[{time.time() - start_time:.3g}s]</dim>')
logger.info(f'Dendrite \t| Request {num_endpoints} x {list(inputs_seq.shape)}')
logger.info(f'Dendrite \t| Request {num_endpoints} x {list(inputs_seq.shape)} (prune_len={prune_len})')
request_start_time = time.time()

# === Define which synapse we want to use ===
@@ -951,6 +955,7 @@ def forward(
f'<dim>[{time.time() - request_start_time:.3g}s]</dim>')

# === Prepare validation parameter set ===
console_width = self.config.get('width', None) # console width for rich table displays of synapse measures
validation_params = {
'uids': random_uids,
'query_responses': query_responses,
@@ -960,6 +965,7 @@ def forward(
'inputs': inputs,
'validation_len': val_len,
'loss_fct': self.loss_fct,
'logits_divergence_penalty':self.config.nucleus.logits_divergence,
'scaling_law_power': self.config.nucleus.scaling_law_power,
'synergy_scaling_law_power': self.config.nucleus.synergy_scaling_law_power,
'vlogger': self.vlogger,
@@ -991,9 +997,9 @@ def scaling_law_loss_to_params(loss):

def textcausallm(uids: torch.Tensor, query_responses: List[List[torch.FloatTensor]], return_ops: List[torch.LongTensor],
times: List[torch.FloatTensor], routing_score: torch.FloatTensor,
inputs: torch.FloatTensor, validation_len: int, loss_fct: Callable,
inputs: torch.FloatTensor, validation_len: int, loss_fct: Callable,
scaling_law_power: float, synergy_scaling_law_power: float, vlogger: ValidatorLogger,
logging, synapse: 'bittensor.TextCausalLM' = None, index_s: int = 0
logits_divergence_penalty: float,logging, synapse: 'bittensor.TextCausalLM' = None, index_s: int = 0
) -> Tuple[torch.FloatTensor, Dict]:
r"""
Calculate Shapley values and neuron response validation measure statistics, given TextCausalLM synapse responses.
@@ -1019,6 +1025,8 @@ def textcausallm(uids: torch.Tensor, query_responses: List[List[torch.FloatTenso
Power for modified scaling law, powered down to improve dynamic range, e.g. 3 → 6 nats for 0.5.
synergy_scaling_law_power (:obj:`float`, `required`):
Power for synergy modified scaling law, powered down to improve dynamic range, e.g. 3 → 6 nats for 0.5.
logits_divergence_penalty (:obj:`float`, `required`):
Penalty scaling for logits divergence.
vlogger (:obj:`ValidatorLogger`, `required`):
Logger for validator.
logging (:obj:`bool`, `required`):
@@ -1069,7 +1077,7 @@ def _synergy(first, second, target, _ext):
loss, stats, unsuccessful = shapley_base(uids, query_responses, return_ops, times, routing_score,
_base_params, index_s, ext='')

logger.info(f'{str(synapse)} \t| Shapley base values (power={scaling_law_power:.1f})'
logger.info(f'{str(synapse)} \t| Shapley base values (power={scaling_law_power:.1f}) '
f'<dim>[{time.time() - shapley_start_time:.3g}s]</dim>')

synergy_start_time = time.time()
@@ -1096,7 +1104,7 @@ def _synergy(first, second, target, _ext):
if hasattr(s[key], 'item'):
s[key] = s[key].item()

logger.info(f'{str(synapse)} \t| Shapley synergy values (power={synergy_scaling_law_power:.1f})'
logger.info(f'{str(synapse)} \t| Shapley synergy values (power={synergy_scaling_law_power:.1f}) '
f'<dim>[{time.time() - synergy_start_time:.3g}s]</dim>')

if logging:
@@ -1117,9 +1125,9 @@ def _synergy(first, second, target, _ext):

def textcausallmnext(uids: torch.Tensor, query_responses: List[List[torch.FloatTensor]], return_ops: List[torch.LongTensor],
times: List[torch.FloatTensor], routing_score: torch.FloatTensor,
inputs: torch.FloatTensor, validation_len: int, loss_fct: Callable,
inputs: torch.FloatTensor, validation_len: int, loss_fct: Callable,
scaling_law_power: float, synergy_scaling_law_power: float, vlogger:ValidatorLogger,
logging, synapse: 'bittensor.TextCausalLMNext' = None, index_s: int = 0
logits_divergence_penalty: float,logging, synapse: 'bittensor.TextCausalLMNext' = None, index_s: int = 0
) -> Tuple[torch.FloatTensor, Dict]:
r"""
Calculate Shapley values and neuron response validation measure statistics, given TextCausalLMNext synapse responses.
@@ -1145,6 +1153,8 @@ def textcausallmnext(uids: torch.Tensor, query_responses: List[List[torch.FloatT
Power for modified scaling law, powered down to improve dynamic range, e.g. 3 → 6 nats for 0.5.
synergy_scaling_law_power (:obj:`float`, `required`):
Power for synergy modified scaling law, powered down to improve dynamic range, e.g. 3 → 6 nats for 0.5.
logits_divergence_penalty (:obj:`float`, `required`):
Penalty scaling for logits divergence.
vlogger (:obj:`ValidatorLogger`, `required`):
Logger for validator.
logging (:obj:`bool`, `required`):
@@ -1183,17 +1193,18 @@ def _synergy(first, second, target, ext):
shapley_start_time = time.time()
loss, stats, unsuccessful = shapley_base(uids, query_responses, return_ops, times, routing_score,
_base_params, index_s, ext='_nxt')
logger.info(f'{str(synapse)} \t| Shapley base values (power={scaling_law_power:.1f})'
logger.info(f'{str(synapse)} \t| Shapley base values (power={scaling_law_power:.1f}) '
f'<dim>[{time.time() - shapley_start_time:.3g}s]</dim>')

divergence_start_time = time.time()
with torch.no_grad():
logits_divergence(stats, uids, query_responses, return_ops, times, index_s, ext='_nxt')
logger.info(f'{str(synapse)} \t| Logits divergences <dim>[{time.time() - divergence_start_time:.3g}s]</dim>')
logger.info(f'{str(synapse)} \t| Logits divergences (penalty={logits_divergence_penalty}) '
f'<dim>[{time.time() - divergence_start_time:.3g}s]</dim>')

synergy_start_time = time.time()
syn_loss_diff = shapley_synergy(stats, _synergy, '_nxt', scaling_law_power=synergy_scaling_law_power)
logger.info(f'{str(synapse)} \t| Shapley synergy values (power={synergy_scaling_law_power:.1f})'
logger.info(f'{str(synapse)} \t| Shapley synergy values (power={synergy_scaling_law_power:.1f}) '
f'<dim>[{time.time() - synergy_start_time:.3g}s]</dim>')

# === Shapley value combination ===
10 changes: 10 additions & 0 deletions bittensor/_subtensor/subtensor_impl.py
@@ -355,6 +355,16 @@ def validator_batch_size (self, netuid: int, block: Optional[int] = None ) -> Op
if not self.subnet_exists( netuid ): return None
return self.query_paratensor("ValidatorBatchSize", block, [netuid] ).value

""" Returns network ValidatorPruneLen hyper parameter """
def validator_prune_len (self, netuid: int, block: Optional[int] = None ) -> Optional[int]:
if not self.subnet_exists( netuid ): return None
return self.query_paratensor("ValidatorPruneLen", block, [netuid] ).value

""" Returns network ValidatorLogitsDivergence hyper parameter """
def validator_logits_divergence (self, netuid: int, block: Optional[int] = None ) -> Optional[float]:
if not self.subnet_exists( netuid ): return None
return self.query_paratensor("ValidatorLogitsDivergence", block, [netuid] ).value/U64_MAX

""" Returns network ValidatorSequenceLength hyper parameter """
def validator_sequence_length (self, netuid: int, block: Optional[int] = None ) -> Optional[int]:
if not self.subnet_exists( netuid ): return None
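
`validator_logits_divergence` above divides the raw on-chain value by `U64_MAX`, so the result is a float rather than an integer. A minimal sketch of that normalization follows, assuming `U64_MAX` is the largest unsigned 64-bit integer (the diff does not show its definition); `normalize_u64` is a hypothetical helper name.

```python
U64_MAX = 2**64 - 1  # assumption: largest unsigned 64-bit integer


def normalize_u64(raw: int) -> float:
    """Map a raw u64 storage value onto the unit interval [0.0, 1.0]."""
    if not 0 <= raw <= U64_MAX:
        raise ValueError("raw value is outside the u64 range")
    return raw / U64_MAX


print(normalize_u64(0))        # 0.0
print(normalize_u64(U64_MAX))  # 1.0
```

Storing the fraction as a u64 lets the chain keep the hyperparameter as an integer while clients recover a float in the unit interval.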